18 Replies Latest reply on Oct 2, 2010 8:05 AM by rowby

    Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps

    rowby Level 1

      Hello all,

       

      If you have followed a couple of my other recent threads here, you will have seen that I have a warning of a failing drive in my 4 TB RAID (a 4 x 1 TB RAID set).

       

      Also, for an unrelated reason, I need to reinstall Windows 7 (and, of course, my Adobe Master Collection CS5).

       

      I ordered a 5th drive to serve as "Hot Spare" and it will be arriving sometime today from Amazon.

       

      I want to make sure I do the "final steps" below properly and in the correct order.

      I am basing my "final steps" on Harm's advice in another thread:

       

      >>>

      You can never be too safe with data, but if you follow the hot-spare, extend raid set and rebuild sequence, it is not necessary. Your data will be reconstructed automatically. That is the reason you use a raid3/5.

       

      Your log file is a definite show of a serious  disk problem. Replace it, create a hot-spare, extend your raid set and  rebuild (it should be done automatically) and you are ready to roll

       

       

      >>>>

       

      (As a precaution I have done a backup of my raid, to an external drive)

       

      So once my new hard drive arrives today from Amazon, I will, in the following order:

       

      1)  Replace failing hard drive.  (Do I need to power down the computer to remove the failing hard drive and replace it with the hot spare?  Or can I leave the computer powered on so that the raid software can continue its CONSISTENCY CHECK while I do all of the steps to rebuild the raid?  See the note at the bottom of this post.)

      2)  Create hot-spare

      3)  Use the Raid controller software (Areca) to EXTEND HOT SPARE

      4)  Rebuild raid (it should be done automatically)

       

      THEN, after I confirm that the raid is working okay, I will

       

      5)  Reinstall Windows 7.

       

      The step below is the one I am not entirely clear on:

       

      6)  I assume that since the raid will already have been rebuilt (before I do the Windows 7 reinstall), I can do a very straightforward "reconnect" to the already working raid.  In other words, I would reinstall the Areca raid software (using the Areca CD) and identify the raid, and Windows 7 will recognize the 4 TB raid as my raid drive.

       

      I am tracking the delivery of my new hard drive and it is at least 5 or 6 hours away from arriving from Amazon.

       

      ****  CONSISTENCY NOTE: In the meantime I am having the Areca web-based controller software do a CONSISTENCY check of the raid.

      It is halfway through the check and has found 144 errors so far.  The consistency check is rather vague about what it is doing.  Can I assume that it is finding the errors and will automatically fix them?

      Or do I have to do something to fix the errors once the consistency check is finished?  (This is one reason why I would like to avoid powering down during the rebuilding of the raid.)  Or will creating the hot spare, etc. interrupt the consistency check, meaning I must finish the consistency check completely before rebuilding the raid?

      Or should I abort the consistency check and do it after I reinstall Windows?  (If possible I would like to do that so I can get on with the editing of my current project!!)

       

      WHEW!

       

      Rowby

       


        • 1. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
          Harm Millaard Level 7

          What happens when a drive fails in a parity raid?

           

          RAID3:

           

          Raid3 uses a dedicated parity drive. When one drive in the raid fails, it can be a data drive or the parity drive. If a data drive fails and is replaced, the lost data needs to be reconstructed from the remaining drives, most notably from the parity drive. If the parity drive fails, all that needs to be done is to reconstruct the parity drive from the original data.
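The two RAID3 cases above can be illustrated with a minimal Python sketch, assuming plain XOR parity over three data drives and one dedicated parity drive. The helper name xor_blocks and the sample bytes are invented for illustration; this is not how the Areca firmware is implemented, only the arithmetic behind the idea.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives plus one dedicated parity drive (4 members).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# Case 1: data drive 1 fails. Its contents are recovered by XORing the
# surviving data drives with the parity drive.
rebuilt_data = xor_blocks([data[0], data[2], parity])

# Case 2: the parity drive fails. Parity is simply recomputed from the
# intact data drives.
rebuilt_parity = xor_blocks(data)

print(rebuilt_data == data[1], rebuilt_parity == parity)
```

In both cases the array only survives the loss of one drive; a second failure before the rebuild completes cannot be recovered from parity alone.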

           

          RAID5:

           

          Raid5 uses distributed parity, so when one drive fails, both the data and the parity need to be reconstructed on that disk. It needs to reconstruct the data from the parity on the other drives and then reconstruct the parity from the data on the other drives. This carries more overhead than in a raid3 and that is why the rebuilding takes more time than in a raid3.
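As a rough illustration of distributed parity, the Python sketch below lays out four stripes across four disks with the parity block rotating from disk to disk, then rebuilds one failed disk. The rotation scheme, the function names build_raid5 and rebuild_disk, and the sample data are assumptions chosen for the example; real controllers use their own layouts.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_raid5(stripes_of_data, n_disks):
    """Place each stripe's data blocks on the disks, rotating the parity disk."""
    disks = [[] for _ in range(n_disks)]
    for stripe_no, data_blocks in enumerate(stripes_of_data):
        parity_disk = (n_disks - 1 - stripe_no) % n_disks  # one simple rotation
        parity = xor_blocks(data_blocks)
        remaining = iter(data_blocks)
        for disk_no in range(n_disks):
            block = parity if disk_no == parity_disk else next(remaining)
            disks[disk_no].append(block)
    return disks


def rebuild_disk(disks, failed):
    """Recompute every block of the failed disk from the surviving disks."""
    survivors = [disk for i, disk in enumerate(disks) if i != failed]
    return [xor_blocks(stripe) for stripe in zip(*survivors)]


stripes = [
    (b"\x01\x02", b"\x03\x04", b"\x05\x06"),
    (b"\x11\x12", b"\x13\x14", b"\x15\x16"),
    (b"\x21\x22", b"\x23\x24", b"\x25\x26"),
    (b"\x31\x32", b"\x33\x34", b"\x35\x36"),
]
disks = build_raid5(stripes, n_disks=4)

# Disk 2 holds data blocks for most stripes and the parity block for one.
rebuilt = rebuild_disk(disks, failed=2)
print(rebuilt == disks[2])
```

The same XOR over the surviving blocks of a stripe recovers the missing block whether it was data or parity, which is why the rebuilt disk ends up with a mix of both.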

           

          Reconstructing the drive has no impact on the OS and reinstalling the OS has no impact on the raid. Just make sure that the raid functions properly and you have the McRAID software available for installation.

          • 2. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
            rowby Level 1

            Harm Millaard wrote:


             

            RAID3:

             

            Raid3 uses a dedicated parity drive. When one drive in the raid fails, it can be a data drive or a parity drive. If a data drive fails and is replaced, the failing data need to be reconstructed from the remaining ...

             

            That clarifies things!   I have a Raid3, and if it turns out that my parity hard drive fails I'll use my external backup to copy the data back.

             

            I am ready to start the "hot spare" process. My new hard drive has arrived and I can pick it up.

             

            The Areca web based browser is still doing its Volume check.  I would like to abort it so that I can get started reconstructing my raid.  So far it is at 60 percent and has found 144 errors.  It's taken over 8 hours just to check 60 percent!

             

            Can I skip the Volume check for now and start the hot spare process?  I'll do a new volume check on another day -- so I can finish my video editing project.

            Or should I wait the several more hours it is going to take to finish that Volume check?   (FYI I've had the raid for about 4 months and never did a volume check.)

             

             

            Rowby

            • 3. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
              ECBowen Most Valuable Participant

              If you are going to rebuild the array then finish the volume check. If you have your data backed up and you are planning on just deleting the raid volume and configuring a new one then you can cancel it.

               

              Eric

              ADK

              • 4. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                rowby Level 1

                Well, since I don't want to rebuild the entire raid, I guess I will have to wait for the volume check to finish.

                But what happens at the end of the volume check?  Are the "Errors" automatically fixed?

                 

                Rowby

                • 5. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                  ECBowen Most Valuable Participant

                  Some controllers have a scan-only option and a scan-and-fix option. Others have just a parity check option, which also attempts to fix the errors. It depends on what your controller has. Most, though, scan and fix.

                   

                  Eric

                  ADK

                  • 6. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                    Harm Millaard Level 7

                    Rowby,

                     

                    Your volume check is just confirming what you already knew: your array is degraded. You can let it finish or you can stop it. There is no harm in letting it continue, but I doubt there is any benefit either.

                     

                    Whichever way you want to go, you need to add the new disk first as a hot-spare to your array. Once that process has completed, you need to expand your array, so your hot-spare becomes a regular disk in the array. That will take some time. Once that process has finished too, I would do a volume check to see if the rebuild was successful. It will be faster too, because the array is no longer degraded.

                    • 7. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                      ECBowen Most Valuable Participant

                      WEB BROWSER-BASED CONFIGURATION

                      6.6.5 Check Volume Set

                      To check a volume set from a RAID set:

                      (1). Click on the “Check Volume Set” link.

                      (2). Click on the volume set from the list that you wish to check. Tick on “Confirm The Operation” and click on the “Submit” button.

                      Use this option to verify the correctness of the redundant data in a volume set. For example, in a system with dedicated parity, volume set check means computing the parity of the data disk drives and comparing the results to the contents of the dedicated parity disk drive. The checking percentage can also be viewed by clicking on “RAID Set Hierarchy” in the main menu.

                       

                       

                      Directly from the Areca 1600 series Manual. I would let that finish.
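The check described in the manual excerpt can be sketched in Python as follows, assuming simple XOR parity with one byte per stripe. The function name check_volume and the sample bytes are made up for illustration. The sketch only counts mismatches; it makes no claim about whether the Areca firmware also repairs them.

```python
def check_volume(data_drives, parity_drive):
    """Count stripes whose stored parity differs from the recomputed parity."""
    errors = 0
    for stripe, stored in enumerate(parity_drive):
        expected = 0
        for drive in data_drives:
            expected ^= drive[stripe]  # recompute parity from the data drives
        if expected != stored:
            errors += 1
    return errors


# Three hypothetical data drives, four one-byte stripes each.
data_drives = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
good_parity = bytes(a ^ b ^ c for a, b, c in zip(*data_drives))

# Simulate one inconsistent stripe on the parity drive.
damaged = bytearray(good_parity)
damaged[2] ^= 0xFF

print(check_volume(data_drives, good_parity))
print(check_volume(data_drives, bytes(damaged)))
```

A mismatch only says that data and parity disagree for that stripe; parity alone cannot tell which drive holds the wrong value.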

                       

                       

                      Eric

                      ADK

                      • 8. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                        rowby Level 1

                        Hi Eric,

                         

                        I see that my "Volume rate" is at 81.5%.  Even though it's taking forever, I figure the check is only 2-3 hours away from finishing.  So I'll just wait before adding the hot spare etc.

                        Hopefully whatever I checked when I started it going will fix the errors automatically.  So far it's found 144 errors, which I don't think is too bad for a 4 TB raid having its first "Volume State Check".   (This is my first raid, and I suppose in a few months I'll be a raid expert...)

                         

                        The volume that it is checking says:

                         

                        Volume set name: ARC 1580-VOIL #000

                        Raid set name: Raid Set #000

                        Volume capacity: 3000.GB

                        SCSI Ch/Id/Lun: 0/0/0

                        Raid level: Raid 3

                        Stripe size: N/A

                        Block size: 512 bytes

                        Member disks: 4

                        Cache mode: Write Back

                        Tagged queuing: Enabled

                        Volume state: Checking (in red)

                        Progress: 81.5%

                        Errors found: 144
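The figures in this thread can be sanity-checked with a little arithmetic. The Python sketch below assumes one member disk's worth of space goes to parity and that the check runs at a constant rate, taking the "60 percent in 8 hours" figure mentioned earlier in the thread at face value. The function names are invented for the example.

```python
def usable_capacity_gb(member_disks, disk_gb, parity_disks=1):
    """Usable space when parity_disks worth of capacity is spent on parity."""
    return (member_disks - parity_disks) * disk_gb


def hours_remaining(percent_done, percent_per_hour):
    """Time left for a check, assuming it keeps running at the same rate."""
    return (100.0 - percent_done) / percent_per_hour


# 4 member disks of roughly 1000 GB each, one disk's worth used for parity.
capacity = usable_capacity_gb(member_disks=4, disk_gb=1000)

# 60% took about 8 hours, so the check covers about 7.5% per hour.
rate = 60.0 / 8.0
eta = hours_remaining(81.5, rate)

print(capacity, round(eta, 1))
```

Under those assumptions the result agrees with the 3000 GB volume capacity shown above and with the estimate of 2-3 hours left at 81.5%.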

                         

                        Rowby

                        • 9. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                          rowby Level 1

                          The raid testing finished and all is marked as normal.

                           

                          So I assume it fixed any errors.

                           

                          I replaced the failing drive with a new one.

                           

                          And am ready to create the Hot Spare.   I am in the DOS Areca Technology RAID Controller setup.

                           

                          From the drop down I am given the choice

                           

                          SELECT HOT SPARE TYPE

                          Global

                          Dedicated to RaidSet

                          Dedicated to Enclosure.

                           

                          Which one do I choose?  See attached image

                           

                          BTW Do I need to have Windows 7 Format the new Samsung drive before I go through the Hot Spare process????

                           


                          P.S.  While waiting for a response here (I have itchy fingers, wanting to get this done) I went to another forum and one comment was:

                           

                          You can probably do any of those if you want it to replace an already failed drive.  Those options are usually for when it is still "extra" and just sitting there as a spare.  When it's there as a spare, you can dedicate the spare to only be able to be used if something fails in "the enclosure", "the raidset", or anywhere ("globally").


                          My question would be if you have hot swap capability, why don't you just  swap it with the failed drive?  (Check the docs on that controller)  rather than adding it as a hot spare first.

                           

                          Thanks

                           

                          Rowby

                           


                          • 10. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                            rowby Level 1

                            Here's the latest update:

                             

                            Raid Set Information:

                            Raid Set #000 :  3/4 Disks: Incomplete

                             

                            Create Hot Spare

                            Select Drives for HotSpare, Max 3 HotSpare Supported

                            [X]  E2SLOT 10: 1000. 2GB Samsung HD 103SJ

                            Create Hot Spare  Yes/NO

                            I select Yes

                             

                            Select Hot Spare Type

                            > Global

                            > Dedicated to RaidSet

                            > Dedicated to Enclosure

                             

                            I select GLOBAL

                             

                             

                            Then I Select EXPAND RAID SET

                            I Select   "Raid Set #000 :  3/4 Disks: Incomplete"

                             

                            And I end up with "No Device Available for Expansion"

                             

                            ???

                             

                            I have made a little YouTube video which is uploading showing this problem.  Stay tuned for link:

                             

                            Not sure what I need to do to deal with "No Device Available for Expansion".

                             

                             

                            Rowby

                            • 13. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                              Harm Millaard Level 7

                              Rowby, look here:

                               

                              2-10-2010 10-46-17.png

                               

                              That is the easiest way.

                              • 14. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                                rowby Level 1

                                Hi Harm,

                                 

                                I'm getting mixed messages from my system.

                                 

                                1)

                                 

                                When the system is booting up I'm seeing

                                 

                                in DOS:

                                 

                                Waiting for Raid controller F/W to become ready

                                 

                                Finally:

                                 

                                NO BIOS Found.  Raid controller bios not installed

                                 

                                If I hit TAB and go into the DOS Raid Setup

                                 

                                In Physical Drive Function, I see all 4 drives, including #10 listed as Hot Spare (identifying it as a Samsung -- so it is seeing the drive, I assume)

                                 

                                E2SLOT 9 ---- Raid Set Member Samsung

                                E2SLOT 10 --  Hot Spare

                                E2SLOT 11 -- Raid set member

                                E2SLOT 12 -- Raid set member

                                 

                                First of all, why is it saying No bios found when the DOS raid controller is showing what looks like a raid...

                                 

                                So I reboot -- and this time I will go into the Windows-based Raid Controller

                                 

                                Windows 7 startup

                                Launch Browser based  Control

                                • 15. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                                  rowby Level 1

                                  More images

                                   

                                  Perhaps the new Samsung needs to be formatted first???

                                   


                                  Still wondering why at startup in dos I'm getting the message:

                                   

                                  No bios disk found,

                                   

                                  Raid controller Bios not installed

                                  • 16. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                                    Rowby Goren Level 1

                                    Harm

                                     

                                    My main bewilderment was why I was seeing "NO BIOS Found.  Raid controller bios not installed" at bootup.

                                     

                                    So I finally took out my new hard drive and replaced it with my old one that is apparently going to fail (although I noticed that during the 20 hours of Volume check there was not one single timeout in the log.  So I am wondering if the hard drive is really failing after all.)

                                    I rebooted and I no longer got the No BIOS found message.

                                    And my raid is showing again, full drive etc.

                                    The web-based Raidset Hierarchy is showing all of my drives as raid sets and the Volume state is Normal in RED.

                                     

                                    So I am thinking that the bios must have been on the hard drive that I replaced with the new one.

                                     

                                     

                                    With all of the above in mind, should I then go back to the failing drive in my raid set and make that as a hot spare -- BEFORE I remove it from the computer????

                                     

                                    After I do that then I can swap it out with the new drive without turning off my computer.

                                     

                                    And THEN expand the raid set?

                                     

                                    Rowby

                                    • 17. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                                      Harm Millaard Level 7

                                      Situation 1
                                      If you made a hotspare, the array will automatically rebuild onto the hotspare when a disk becomes defective.

                                      Situation 2
                                      If you don't have a hotspare and a disk fails, you can hotswap a new disk. Rebuild will start immediately on the new disk.

                                      Situation 3
                                      If you turn off the PC, remove the defective disk (cold swap), insert a new disk and restart the system, the Areca will do nothing..!!

                                      Areca has marked the location of the defective disk as bad and otherwise nothing has changed.  In this case you need to tell the Areca via McRAID BIOS or the web-based interface to create the new disk as HOTSPARE, and immediately after that the hotspare will be used as a member of the INCOMPLETE ARRAY to rebuild.
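The three situations can be summarised as a small decision function. This Python sketch only restates the description given in this post; the function name next_step and its parameters are hypothetical, and it is not a model of the actual Areca firmware logic.

```python
def next_step(hot_spare_present, swap_method=None):
    """Return what happens after a disk failure, per the three situations.

    swap_method is "hot" (replaced while powered on), "cold" (replaced
    while powered off), or None when no disk has been swapped.
    """
    if hot_spare_present:
        # Situation 1: an existing hot spare takes over by itself.
        return "rebuild starts automatically on the hot spare"
    if swap_method == "hot":
        # Situation 2: no hot spare, new disk inserted while running.
        return "rebuild starts immediately on the new disk"
    if swap_method == "cold":
        # Situation 3: nothing happens until the new disk is made a hot spare.
        return "create the new disk as a hot spare, then rebuild starts"
    return "array stays incomplete until a disk is added"


print(next_step(hot_spare_present=False, swap_method="cold"))
```

A cold swap followed by creating a hot spare matches the sequence attempted earlier in this thread, so on this description no separate "expand raid set" step should be needed.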

                                       

                                      The missing BIOS message is very strange and I do not have an explanation.

                                      • 18. Re: Raid- Understanding "Hot Spare" and Windows reinstall - Final Steps
                                        rowby Level 1

                                        Hee hee

                                         

                                        It's good to know that with a Raid3 I have so many options.

                                         

                                        For now, since I am not getting any timeouts in the log I am going to keep my current setup and revisit the hot spare and the mystery of the missing bios message next week.

                                         

                                        I should finish my project on Monday and then I can dig more into this.

                                         

                                        I am hoping that the Volume check that found about 155 errors (mainly near the beginning of the check) fixed any bad "sectors" on the "bad" hard drive and I can continue with my work for the next couple of days.

                                         

                                        Yes, I am making backups of my project at least twice a day just in case.

                                         

                                        I think the mystery of the "missing BIOS message" must have something to do with why I was unable to make the new drive a hot spare.

                                         

                                        Once my project is finished I may just do a complete rebuild of my raid configuration if that's what it takes to get the bios to return.

                                         

                                        For now, with my old hard drive back in its slot, I am not getting that missing bios message.


                                        Rowby