[SOLVED] HD 5% crash: accessing HD using Mageia Live USB

Postby Kirsty » May 29th, '14, 16:42

Hello,

I encountered a HD crash. 5% of my HD is crippled. A part of it is used by Mageia 4. When starting up, Mageia 4 doesn't load anymore.
So I used my backup system, a Mageia live USB stick. That works fine, but I would like to access my HD.

When running in Live mode I don't have access to my HD home directory.

How do I mount my HD home directory using my Live mageia session?

Kirsty.
Last edited by Kirsty on Jun 5th, '14, 09:04, edited 1 time in total.
Kirsty
 
Posts: 21
Joined: May 3rd, '14, 08:56

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby wintpe » May 29th, '14, 17:27

OK.

Open a command shell in your Mageia live environment (in KDE that's Start -> System Tools -> Konsole) and become root:

su -

Now run

fdisk -l

The output will be something like:

Code: Select all
Disk /dev/sda: 256.1 GB, 256060514304 bytes, 500118192 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048    25189919    12593936   83  Linux
/dev/sda2        25192440   500103449   237455505    5  Extended
/dev/sda3   *           0           0           0    0  Empty
/dev/sda4   *           0           0           0    0  Empty
/dev/sda5        25192448    33367004     4087278+  82  Linux swap / Solaris
/dev/sda6        33370112   500103449   233366669   83  Linux


Now identify which of the /dev/sdaX entries is the partition you want to mount, or work through them as follows until you find it.

Most live CDs already have a directory /mnt. Check with

ls -al /mnt

If it does not exist, create it first:

mkdir /mnt

Then mount the partition:

mount /dev/sda6 /mnt

You should now be able to open Dolphin or similar, browse to /mnt, and see whether that is the partition that has your files.
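
To narrow down the candidates before mounting, the partition listing can be filtered for plain Linux partitions (Id 83). This is only a minimal sketch, assuming the old-style `fdisk -l` column layout shown above; the function name `list_linux_partitions` is made up for illustration.

```shell
# Hypothetical helper: read `fdisk -l` text on stdin and print the device
# names of partitions whose type columns read "83  Linux".
# Swap, extended and empty entries are skipped.
list_linux_partitions() {
    awk '$1 ~ /^\/dev\// && $NF == "Linux" && $(NF-1) == "83" { print $1 }'
}
```

As root, `fdisk -l /dev/sda | list_linux_partitions` would print `/dev/sda1` and `/dev/sda6` for the example table. On a disk with known bad blocks, mounting read-only (`mount -o ro /dev/sda6 /mnt`) is a cautious choice, since it avoids ordinary writes to a possibly damaged filesystem.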

regards peter

PS: I'm adding this footnote to cover the possibility that your disk's problems affect your home area.
If the bad blocks on the hard disk have affected your home area, the filesystem may be corrupt.

The badblocks program can run a read analysis of your hard disk, which can prompt the drive to remap the bad sectors and record them in its bad sector table.

When the drive remaps them, it can't always recover the data, which leaves your filesystem corrupt.

The fsck command can be run on your filesystem afterwards to bring part of it back to some sanity.

So the steps to run badblocks are:

badblocks /dev/sda

By default this is a read-only analysis.

It will report the bad blocks it has found, and the HDD may then try to remap them.

Once completed, a second run should ideally show no bad blocks.

Then each filesystem/mount point will need to be checked with the check facility native to the filesystem; on Mageia that's probably fsck.ext4, on Windows it's chkdsk, and so on.

hope that helps

regards peter
Redhat 6 Certified Engineer (RHCE)
Sometimes my posts will sound short or snappy; however, it's really not my intention to offend, so accept my apologies in advance.
wintpe
 
Posts: 1204
Joined: May 22nd, '11, 17:08
Location: Rayleigh, Essex, UK

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby doktor5000 » May 30th, '14, 00:05

Kirsty wrote:I encountered a HD crash. 5% of my HD is crippled.

How exactly did you encounter it? By crash, what do you mean exactly? What do you mean by crippled, and how did you check that?
Cauldron is not for the faint of heart!
Caution: Hot, bubbling magic inside. May explode or cook your kittens!
----
Disclaimer: Beware of allergic reactions in answer to unconstructive complaint-type posts
User avatar
doktor5000
 
Posts: 18067
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby Kirsty » May 30th, '14, 09:52

Thanks wintpe and doktor5000.

I established a HD crash using Ubuntu. When starting up, the system gave a lot of numbers and inode blocks that were faulty or crippled.
I use crippled as a word to indicate that there are bad blocks on my HD.

I changed my system to Mageia. All my data is saved, synced to an external HD.

The mounting process succeeds, but I can't access my home directory. I don't have the rights as the live user.

I did not know there was a 'badblocks' command.

I did

su -
badblocks /dev/sda


but then nothing happens on my terminal.

Then when starting up I got a new screen:

I get

(Repari:/#

next I type
exit

and I get all kinds of startup processes; it gives errors on the LBE process.

Finally Mageia 4 starts up as usual, but it is slower.

Is there a specific program that can help me relocate bad blocks?

Kirsty

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby Kirsty » May 30th, '14, 11:17

Just checked the terminal with the badblocks command and it shows lots of numbers:

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby doktor5000 » May 30th, '14, 11:47

Kirsty wrote:I established a HD crash using Ubuntu. When starting up, the system gave a lot of numbers and inode blocks that were faulty or crippled.
I use crippled as a word to indicate that there are bad blocks on my HD.

Then please say so directly next time; the words are pretty important, as they're the only thing we have to help you remotely.
"Crippled" can mean basically three different things:
- HDD physically damaged (head crash, broken spindle motor, or something like that) - worst case
- HDD logically damaged (partition table broken; filesystem, LVM or RAID damaged)
- HDD physically-logically damaged - bad block relocation has already started or is in effect

Kirsty wrote:Just checked the terminal with the badblocks command and it shows lots of numbers:

Yes, those are the addresses of the bad blocks. Usually you want to write them to a file and then hand that file over when creating a new filesystem.

This is what I'd use:
Code: Select all
badblocks /dev/sda | tee bad-blocks_sda

Be aware that such a scan can take several hours for a complete disk, depending on the speed.
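
Once the scan has finished, the saved list can be summarised without reading the disk again. This is a minimal sketch, assuming the file holds one block number per line (as written by the `tee` command above) and that badblocks ran with its default block size of 1024 bytes; `summarize_bad_blocks` is a made-up name.

```shell
# Hypothetical helper: count the entries in a saved badblocks list and
# estimate the affected size, assuming the default 1024-byte block size
# (so one listed block is roughly one KiB).
summarize_bad_blocks() {
    list_file="$1"
    # Count only lines that consist purely of digits; grep exits non-zero
    # when the count is 0, so its status is ignored.
    count=$(grep -c -E '^[0-9]+$' "$list_file" || true)
    echo "bad blocks: $count (about $count KiB)"
}
```

For example, `summarize_bad_blocks bad-blocks_sda` reports how many blocks the scan flagged. A count that grows between two scans suggests the damage is spreading.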

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby Kirsty » May 31st, '14, 09:31

Indeed, in my case no bad blocks appear during the first hour, but then, oh la la, the numbers keep coming.
It takes several hours to perform the scan.

How do I create a new filesystem?

Or should I post this question in a new post?

Kirsty

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby doktor5000 » May 31st, '14, 11:14

No need to create a new thread, as creating a new filesystem with this badblocks list is in the scope of this thread.

But first some explanation so you get a better understanding:
What badblocks does is scan the drive for bad blocks and put them on a list. You can use that list when creating a new filesystem, to avoid writing to those known bad blocks.
Nevertheless, this only works around the problem. The best thing is usually to use the vendor's diagnostic tool to write over the complete hard disk, which is a kind of low-level format. It writes to each block, and when it encounters a bad block, the drive normally recognises that, marks the block "do not use" and points to a block in its reserve pool instead. This effectively "fixes" the bad block issue if the drive is still working properly and has reserve sectors available.
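
For the filesystem step itself, mke2fs (and therefore mkfs.ext4) documents a `-c` option that runs the bad-block check while formatting. That sidesteps a mismatch: a list made by scanning the whole disk /dev/sda uses whole-disk block numbers and badblocks' own block size, neither of which necessarily lines up with a filesystem created on a single partition. Because formatting erases the partition, the sketch below only prints the command for review; `print_mkfs_cmd` is a made-up helper and the partition name is an assumption.

```shell
# Hypothetical helper: print (never run) a mkfs command that checks for
# bad blocks while creating the filesystem.
# DESTRUCTIVE if the printed command is executed: it erases the partition.
print_mkfs_cmd() {
    part="$1"
    case "$part" in
        # Crude sanity check: accept only /dev names that end in a digit.
        /dev/*[0-9])
            printf 'mkfs.ext4 -c %s\n' "$part"
            ;;
        *)
            echo "refusing: expected a partition such as /dev/sda6" >&2
            return 1
            ;;
    esac
}
```

`print_mkfs_cmd /dev/sda6` prints the command so it can be double-checked against the fdisk listing before anyone runs it as root.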

What you still need to keep in mind: You shouldn't use that drive anymore for important data. The drive could fall apart anytime in the worst case, or the bad blocks could dramatically increase. I've seen cases where that didn't happen and people have used such disks with some bad blocks for years.
That's your call.

Before we continue further, please also post the output of the following command, run as root:
Code: Select all
smartctl --all /dev/sda

You may need to install smartmontools package for that.
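
In that output, the raw values most directly tied to bad sectors are Reallocated_Sector_Ct (5), Current_Pending_Sector (197) and Offline_Uncorrectable (198). They can be picked out with a small filter. This is a sketch only, assuming the attribute table layout printed by smartctl 6.2, where the raw value is the tenth column; `smart_sector_summary` is a made-up name.

```shell
# Hypothetical helper: read smartctl output on stdin and print
# "ATTRIBUTE_NAME=RAW_VALUE" for attributes 5, 197 and 198.
# The NF and name checks skip other tables whose rows also start with a number.
smart_sector_summary() {
    awk 'NF >= 10 && $2 ~ /^[A-Za-z_]/ &&
         ($1 == 5 || $1 == 197 || $1 == 198) { print $2 "=" $10 }'
}
```

Typical use, as root, would be `smartctl -A /dev/sda | smart_sector_summary`. Non-zero pending or uncorrectable counts are the numbers to watch.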

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby Kirsty » Jun 1st, '14, 10:05

Output of smartctl:

Code: Select all
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.12.20-desktop-1.mga4] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF)
Device Model:     WDC WD10EARS-00Z5B1
Serial Number:    WD-WMAVU1522465
LU WWN Device Id: 5 0014ee 0573f9516
Firmware Version: 80.00A80
User Capacity:    1.000.204.886.016 bytes [1,00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sun Jun  1 10:03:55 2014 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)   Offline data collection activity
               was suspended by an interrupting command from host.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       (28500) seconds.
Offline data collection
capabilities:           (0x7b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   2) minutes.
Extended self-test routine
recommended polling time:     ( 326) minutes.
Conveyance self-test routine
recommended polling time:     (   5) minutes.
SCT capabilities:           (0x3031)   SCT Status supported.
               SCT Feature Control supported.
               SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   192   189   051    Pre-fail  Always       -       43564
  3 Spin_Up_Time            0x0027   179   175   021    Pre-fail  Always       -       6025
  4 Start_Stop_Count        0x0032   094   094   000    Old_age   Always       -       6112
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5507
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   094   094   000    Old_age   Always       -       6053
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       79
193 Load_Cycle_Count        0x0032   167   167   000    Old_age   Always       -       100282
194 Temperature_Celsius     0x0022   119   107   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   194   192   000    Old_age   Always       -       1098
198 Offline_Uncorrectable   0x0030   197   195   000    Old_age   Offline      -       618
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   196   194   000    Old_age   Offline      -       690

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Last edited by doktor5000 on Jun 1st, '14, 11:56, edited 1 time in total.
Reason: added code tags

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby jiml8 » Jun 1st, '14, 20:12

That drive is failing. I would advise you to immediately scroll all your data off of it and onto another drive, then turn that drive into a paperweight. Based upon its powered on time, you probably still have a warranty for it; replace it.

Your power cycles are very high; you turn this computer on and off a lot?

As a final comment, you should not use a WD Green drive for a system drive, period. You also should not use a WD Green drive with a Linux system unless you use the WD utility (don't recall the name offhand) that lets you change the wait time before parking the heads. Your load cycle count is very high, which is typical of one of these drives on a Linux system, and the number of lifetime load cycles for one of these drives is rated at about 500K.
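
A rough way to put the load cycle count in context is to divide it by the powered-on hours and project forward to the roughly 500K rating mentioned above. This is only a sketch with integer arithmetic; the function names are made up, and a constant parking rate is an assumption.

```shell
# Hypothetical helpers using integer shell arithmetic (results round down).

# Average head load/unload cycles per powered-on hour.
load_cycles_per_hour() {
    cycles="$1"; hours="$2"
    echo $(( cycles / hours ))
}

# Powered-on hours left before reaching the rated cycle count, assuming
# the average rate stays constant (the rate must be at least 1 per hour).
hours_until_rated() {
    cycles="$1"; hours="$2"; rated="$3"
    rate=$(( cycles / hours ))
    echo $(( (rated - cycles) / rate ))
}
```

With the values from the SMART output posted above (100282 load cycles in 5507 powered-on hours), `load_cycles_per_hour 100282 5507` gives 18 cycles per hour.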
jiml8
 
Posts: 1254
Joined: Jul 7th, '13, 18:09

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby doktor5000 » Jun 1st, '14, 23:37

jiml8 wrote:You also should not use a WD Green drive with a Linux system unless you use the WD utility (don't recall the name offhand) that lets you change the wait time before parking the heads.

It's called wdidle3 by WD and there's also http://idle3-tools.sourceforge.net/
Check http://jeanbruenn.info/2011/01/23/wd-gr ... cle-count/ or viewtopic.php?p=14468#p14468

FWIW, I've had one of those as an external drive for 4 years and it works without issues. But I also disabled the wait time shortly after I bought it.
I wouldn't recommend it again; better to get a WD Red or Black for home use, or one of the better drives if you require more reliability.

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby jiml8 » Jun 2nd, '14, 04:39

I also have a 4 year old WD 2TB Green drive in my workstation. It is the only non-scsi drive in the system, and I use it as a storage drive, which is its intended purpose. About a year ago, I ran that WD utility on the drive to stop its head parking. I seem to recall trying to do that much earlier but had some problem or other, did not succeed, and did not worry about it. Now, as a 4 year old, the load cycle count is about 435K. When I ran the utility, IIRC, it was about 433K.

The drive is working OK, though I am starting to monitor it closely. It consistently reports 15 unreadable (pending) sectors and running badblocks does not influence that. I have not made much of an issue of it, though I probably should.

I evaluate my drive as being OK, though not perfect and possibly in the very early stages of failure - though it appears stable.

For comparison with the OP's drive, here is my smartctl output - and I reiterate, I am not totally happy with this:
Code: Select all
root@dadsbox:jiml> smartctl -a /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.12.20-desktop-1.mga4] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF)
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZ20018291
LU WWN Device Id: 5 0014ee 6002524bd
Firmware Version: 50.0AB50
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sun Jun  1 19:35:16 2014 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (36600) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 417) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   199   198   051    Pre-fail  Always       -       149
  3 Spin_Up_Time            0x0027   158   156   021    Pre-fail  Always       -       7091
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       160
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   055   055   000    Old_age   Always       -       33216
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       158
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       123
193 Load_Cycle_Count        0x0032   055   055   000    Old_age   Always       -       435786
194 Temperature_Celsius     0x0022   115   104   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       15
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   199   199   000    Old_age   Offline      -       275

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     33182         -
# 2  Extended offline    Completed without error       00%     33015         -
# 3  Extended offline    Completed without error       00%     32847         -
# 4  Extended offline    Completed without error       00%     32680         -
# 5  Extended offline    Completed without error       00%     32512         -
# 6  Extended offline    Completed without error       00%     32344         -
# 7  Extended offline    Completed without error       00%     32176         -
# 8  Extended offline    Completed without error       00%     32009         -
# 9  Extended offline    Completed without error       00%     31841         -
#10  Extended offline    Completed without error       00%     31673         -
#11  Short offline       Completed without error       00%     31518         -
#12  Extended offline    Completed without error       00%     31506         -
#13  Extended offline    Completed without error       00%     31338         -
#14  Extended offline    Completed without error       00%     31170         -
#15  Extended offline    Completed without error       00%     31002         -
#16  Extended offline    Completed without error       00%     30834         -
#17  Extended offline    Completed without error       00%     30667         -
#18  Extended offline    Completed without error       00%     30499         -
#19  Extended offline    Completed without error       00%     30331         -
#20  Extended offline    Completed without error       00%     30164         -
#21  Extended offline    Completed without error       00%     29996         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Note that this is still a good drive, but the pending sector count is a bit worrisome, as is the multi-zone error rate. Compare these values with the OP's drive, which quite clearly is failing.

Re: HD 5% crash: accessing HD using Mageia Live USB

Postby Kirsty » Jun 5th, '14, 09:03

Hardware is not my thing, but I'll try to replace my HD myself.

Thanks everybody for the help, it was useful and informative.