System boots with Read Only file systems


Postby gregM » Jun 8th, '18, 05:55

Hi All,
A couple of days ago, I could not login to the desktop (KDE Plasma5).

Found there was an issue with some disk partitions not being mounted (home in particular) and others being mounted in read-only mode.
My reading has indicated there may be a failing hard drive, so a new one is on the way.

I booted with the CD/DVD Mageia6 Installer ISO to repair mode, and have run fsck on all the partitions and they are now all reporting as clean (even with the -f force check option), but the normal boot still results in read-only / and /usr and /home and another backup partition not being mounted at all.

I use 'less /proc/mounts |grep sda' to see the mounted partitions: / and /usr are marked 'ro', and my home and backup partitions are not mounted.
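A minimal sketch of the same check, for anyone who wants only the read-only entries picked out. The function name list_ro_mounts and the sample lines are made up for illustration; the only assumption is the usual /proc/mounts layout (device, mount point, type, comma-separated options).

```shell
# list_ro_mounts: read /proc/mounts-style lines on stdin and print
# "device mountpoint" for every entry whose option list contains "ro".
# (Hypothetical helper name, not part of any package.)
list_ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $1, $2; break }
    }'
}

# Sample input, invented for illustration (not copied from the affected machine).
list_ro_mounts <<'EOF'
/dev/sda1 / ext4 ro,relatime,data=ordered 0 0
/dev/sda6 /usr ext4 ro,relatime 0 0
/dev/sda7 /var ext4 rw,relatime,data=ordered 0 0
EOF
```

On a live system the equivalent call would be: grep sda /proc/mounts | list_ro_mounts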

So while waiting for a new drive I would like to see why the system is thinking there is an error or problem. Where to look ?

I have used 'mount /dev/sda1 -o remount,rw /' and 'mount /dev/sda6 -o remount,rw /usr' to mount the read-only partitions and then mount -a to mount the fstab defined partitions and all is working fine.

Have completed backup of the home partition where all my user data lives, so when the new disk arrives I would like to install a nice clean system then migrate whatever data in my home to the new system. To this end I wonder when a Mageia 6.1 installer.ISO might be available with the mega updates on it.....

What I would like to know is where I can find out why this is happening. I have checked dmesg, /var/log/messages and /var/log/kernel/*.log but can't see anything that looks like a cause or a record of the failure.

Any clues ??
gregM
 
Posts: 21
Joined: Jan 16th, '18, 01:34

Re: System boots with Read Only file systems

Postby morgano » Jun 8th, '18, 08:07

Did this happen after you updated the kernel?
Anyway, as a test, try booting an older kernel.
There have been various kernel issues because of the Meltdown/Spectre mitigations, which have kept changing with each kernel since about New Year, and there have been problems with some hardware combinations.
Mandriva since 2006, then Mageia since 2011 at home & work. Thinkpad T40 T42p T43 T60 T61 T400. Aspire 7. Fileserver. Workstation using LVM, LUKS, VirtualBox, BOINC, Dropbox, CAD, urpmi-proxy, NextCloud...
morgano
 
Posts: 538
Joined: Jun 15th, '11, 17:51
Location: Kivik, Sweden

Re: System boots with Read Only file systems

Postby gregM » Jun 8th, '18, 08:46

I had already tried an earlier kernel to see if that helped, but no.
I have just tried going back from 4.14.44-server-2 to 4.14.25-server-1, so I have now tried 3 or 4 different kernels with no change....
None of the earlier kernels gave this condition before, so I'm thinking the disk may be on the way out...!

I read that some hardware faults happen before the system is able to log a reason why ....
Thanks for the reply.
gregM
 
Posts: 21
Joined: Jan 16th, '18, 01:34

Re: System boots with Read Only file systems

Postby wintpe » Jun 8th, '18, 12:48

ext4 file systems will go read-only if they are suffering write errors.

This is to protect the data from further corruption, allowing you to take corrective action.

Try running badblocks on your drive. By default it does a read-only test, which might not be enough.

You may need to take the disk offline and use its write test.

That will overwrite your data, so only use it once you are up and running on a new disk.

What about errors in /var/log/messages?

regards peter
Redhat 6 Certified Engineer (RHCE)
Sometimes my posts will sound short, or snappy, however its realy not my intention to offend, so accept my apologies in advance.
wintpe
 
Posts: 1168
Joined: May 22nd, '11, 17:08
Location: Rayleigh,, Essex , UK

Re: System boots with Read Only file systems

Postby morgano » Jun 8th, '18, 20:04

Yes, if it is read-only it cannot save logs to disk...
Most important now is to make sure you have a backup of your important data.
Maybe boot from a live disk to perform the backup.
Then check the S.M.A.R.T. status of the disk to see if it knows about any problem within itself.
morgano
 
Posts: 538
Joined: Jun 15th, '11, 17:51
Location: Kivik, Sweden

Re: System boots with Read Only file systems

Postby gregM » Jun 9th, '18, 01:28

Thanks for the hint to try the SMART diagnostics.
Here are the error results of smartctl --all /dev/sda

There are 2 errors reported... not really sure what they are telling me as yet :)

Code: Select all
Error 2 occurred at disk power-on lifetime: 37482 hours (1561 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 28 eb 84 e4  Error: UNC 8 sectors at LBA = 0x0484eb28 = 75819816

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 28 eb 84 e4 08      00:22:42.031  READ DMA
  c8 00 08 d0 1b 86 e4 08      00:22:42.031  READ DMA
  ea 00 00 97 b4 6b e0 08      00:22:42.031  FLUSH CACHE EXT
  c8 00 08 90 b4 6b ea 08      00:22:42.031  READ DMA
  ca 00 08 08 c5 6c ea 08      00:22:42.031  WRITE DMA

Error 1 occurred at disk power-on lifetime: 37482 hours (1561 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d8 28 eb 84 e4  Error: UNC 216 sectors at LBA = 0x0484eb28 = 75819816

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 f0 10 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 08 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 00 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 b0 1c 46 e4 08      00:22:42.030  READ DMA
  c8 00 f8 10 f2 44 e4 08      00:22:42.030  READ DMA


This is the full output :-
Results of smartctl --all /dev/sda
Code: Select all
[root@newport greg]# smartctl --all /dev/sda
smartctl 6.5 2016-05-07 r4318 [i686-linux-4.14.44-server-2.mga6] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F3
Device Model:     SAMSUNG HD103SJ
Serial Number:    S246J9CZ913936
LU WWN Device Id: 5 0024e9 203663eb1
Firmware Version: 1AJ10001
User Capacity:    1,000,203,804,160 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sat Jun  9 09:06:05 2018 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                ( 9420) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 157) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       93
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   070   069   025    Pre-fail  Always       -       9318
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       500
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       42194
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       541
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       8
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   051   000    Old_age   Always       -       29 (Min/Max 9/49)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       75
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       542

SMART Error Log Version: 1
ATA Error Count: 2
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 37482 hours (1561 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 28 eb 84 e4  Error: UNC 8 sectors at LBA = 0x0484eb28 = 75819816

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 28 eb 84 e4 08      00:22:42.031  READ DMA
  c8 00 08 d0 1b 86 e4 08      00:22:42.031  READ DMA
  ea 00 00 97 b4 6b e0 08      00:22:42.031  FLUSH CACHE EXT
  c8 00 08 90 b4 6b ea 08      00:22:42.031  READ DMA
  ca 00 08 08 c5 6c ea 08      00:22:42.031  WRITE DMA

Error 1 occurred at disk power-on lifetime: 37482 hours (1561 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d8 28 eb 84 e4  Error: UNC 216 sectors at LBA = 0x0484eb28 = 75819816

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 f0 10 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 08 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 00 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 b0 1c 46 e4 08      00:22:42.030  READ DMA
  c8 00 f8 10 f2 44 e4 08      00:22:42.030  READ DMA

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed [00% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

gregM
 
Posts: 21
Joined: Jan 16th, '18, 01:34

Re: System boots with Read Only file systems

Postby gregM » Jun 9th, '18, 04:40

Looks like I needed to also do a long 'self test' to get more information from SMART...

The SMART attributes do not record any failures it seems ...
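As a rough sketch, the attribute table can be filtered down to the sector-health counters. The choice of IDs 5, 197 and 198 is an assumption about which attributes are worth watching, and smart_key_attrs is a made-up name; the sample rows are copied from the smartctl output in this thread.

```shell
# smart_key_attrs: read `smartctl -A` / `smartctl --all` output on stdin and
# print "NAME RAW_VALUE" for attributes 5, 197 and 198.
# The FLAG column test ($3 starts with 0x) keeps other tables in the same
# output, such as the selective self-test spans, from matching.
smart_key_attrs() {
    awk 'NF >= 10 && $3 ~ /^0x/ && ($1 == 5 || $1 == 197 || $1 == 198) { print $2, $10 }'
}

# Rows taken from the attribute table posted earlier in the thread.
smart_key_attrs <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       42194
197 Current_Pending_Sector  0x0032   252   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
EOF
```

On a live system this would be run as root: smartctl -A /dev/sda | smart_key_attrs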

So maybe the drive is not as sick as I suspected, and is not the reason the system mounts a few partitions in read-only mode ....
The only errors seem to be :-
Error: UNC 8 sectors at LBA = 0x0484eb28 = 75819816
Error: UNC 216 sectors at LBA = 0x0484eb28 = 75819816

Code: Select all
# First, start the extended self-test
smartctl -t long /dev/sda
# then, a few hours later, read the results
smartctl --all /dev/sda
 

smartctl 6.5 2016-05-07 r4318 [i686-linux-4.14.44-server-2.mga6] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F3
Device Model:     SAMSUNG HD103SJ
Serial Number:    S246J9CZ913936
LU WWN Device Id: 5 0024e9 203663eb1
Firmware Version: 1AJ10001
User Capacity:    1,000,203,804,160 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sat Jun  9 12:17:28 2018 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)   Offline data collection activity
               was never started.
               Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:       ( 9420) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               General Purpose Logging supported.
Short self-test routine
recommended polling time:     (   2) minutes.
Extended self-test routine
recommended polling time:     ( 157) minutes.
SCT capabilities:           (0x003f)   SCT Status supported.
               SCT Error Recovery Control supported.
               SCT Feature Control supported.
               SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       93
  2 Throughput_Performance  0x0026   055   055   000    Old_age   Always       -       8681
  3 Spin_Up_Time            0x0023   070   069   025    Pre-fail  Always       -       9318
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       500
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       42197
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       541
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       8
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   051   000    Old_age   Always       -       32 (Min/Max 9/49)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       75
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       542

SMART Error Log Version: 1
ATA Error Count: 2
   CR = Command Register [HEX]
   FR = Features Register [HEX]
   SC = Sector Count Register [HEX]
   SN = Sector Number Register [HEX]
   CL = Cylinder Low Register [HEX]
   CH = Cylinder High Register [HEX]
   DH = Device/Head Register [HEX]
   DC = Device Command Register [HEX]
   ER = Error register [HEX]
   ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 37482 hours (1561 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 28 eb 84 e4  Error: UNC 8 sectors at LBA = 0x0484eb28 = 75819816

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 28 eb 84 e4 08      00:22:42.031  READ DMA
  c8 00 08 d0 1b 86 e4 08      00:22:42.031  READ DMA
  ea 00 00 97 b4 6b e0 08      00:22:42.031  FLUSH CACHE EXT
  c8 00 08 90 b4 6b ea 08      00:22:42.031  READ DMA
  ca 00 08 08 c5 6c ea 08      00:22:42.031  WRITE DMA

Error 1 occurred at disk power-on lifetime: 37482 hours (1561 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 d8 28 eb 84 e4  Error: UNC 216 sectors at LBA = 0x0484eb28 = 75819816

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 f0 10 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 08 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 00 eb 84 e4 08      00:22:42.030  READ DMA
  c8 00 08 b0 1c 46 e4 08      00:22:42.030  READ DMA
  c8 00 f8 10 f2 44 e4 08      00:22:42.030  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     42197         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed [00% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

gregM
 
Posts: 21
Joined: Jan 16th, '18, 01:34

Re: System boots with Read Only file systems

Postby gregM » Jun 9th, '18, 09:52

I now have again booted to 'repair' mode and run fsck.ext4 -cv /dev/sda* on all the partitions.
This was to include the badblocks check and update the badblocks inode(s) but no badblocks were found.

I followed this with a force check by adding the -f switch on the fsck check but no errors were found.

Reboot still has the same results 2 read-only partitions and 2 unmounted partitions. Remount to read-write mode and mount the unmounted ones and I can login and scratch my head and wonder...

If anyone can give any insight in the SMART errors and Attributes from the previous post, it would be appreciated.

I note the earlier hint that the read-only check for bad blocks may not be enough, and that I may need to do a read/write badblocks check, which will overwrite the disk contents.

Backup is all completed, so will wait for new drive to arrive, and see how that goes.
Anyone have any clue if Mageia 6.1 ISO is any closer to publication..?
gregM
 
Posts: 21
Joined: Jan 16th, '18, 01:34

Re: System boots with Read Only file systems

Postby filip » Jun 9th, '18, 15:26

Regarding disk errors I found gsmartcontrol very handy.
filip
 
Posts: 401
Joined: May 4th, '11, 22:10
Location: Kranj, Slovenia

Re: System boots with Read Only file systems

Postby doktor5000 » Jun 9th, '18, 15:34

Well he already ran smartctl directly - gsmartcontrol is only a GUI for that.

gregM wrote:Reboot still has the same results 2 read-only partitions and 2 unmounted partitions. Remount to read-write mode and mount the unmounted ones and I can login and scratch my head and wonder...

If anyone can give any insight in the SMART errors and Attributes from the previous post, it would be appreciated.


The UNC errors are uncorrectable read errors, but judging by when they occurred relative to Power_On_Hours, they happened about 200 power-on days ago (42194 - 37482 = 4712 hours). Otherwise the counters look fine.
badblocks will basically not find any hardware defects, only blocks that the drive's own error management has already marked as bad. The only safe way to know would be to run the vendor's own tools against the drive and do a full low-level overwrite, then try again.
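The age estimate for the UNC errors can be reproduced with shell arithmetic. This is only a sketch: the numbers are copied from the smartctl output earlier in the thread, hours_to_days is a made-up helper, and the result counts powered-on time, not calendar days.

```shell
# Values copied from the smartctl output earlier in the thread.
power_on_hours=42194   # attribute 9, Power_On_Hours (raw value)
error_hours=37482      # "Error 2 occurred at disk power-on lifetime"

# hours_to_days: whole days contained in the given number of hours.
hours_to_days() {
    echo $(( $1 / 24 ))
}

elapsed=$(( power_on_hours - error_hours ))
echo "powered-on hours since the UNC errors: $elapsed"
echo "which is about $(hours_to_days "$elapsed") days of power-on time"

# The failing LBA from the error log, converted from hex to decimal.
echo "LBA 0x0484eb28 = $(( 0x0484eb28 ))"
```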

But I suspect the reason is different, best attach some USB drive or if you have another partition on another drive that you can write to.
Boot regularly, open a terminal and run
Code: Select all
journalctl -ab > /path/to/journal.log
where /path/to/journal.log would be writing to some other place outside that disk.
Cauldron is not for the faint of heart!
Caution: Hot, bubbling magic inside. May explode or cook your kittens!
----
Disclaimer: Beware of allergic reactions in answer to unconstructive complaint-type posts
doktor5000
 
Posts: 14468
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: System boots with Read Only file systems

Postby gregM » Jun 10th, '18, 01:34

doktor5000 wrote:Boot regularly, open a terminal and run
Code: Select all
journalctl -ab > /path/to/journal.log
where /path/to/journal.log would be writing to some other place outside that disk.


Thanks for that hint.
The /var partition is getting mounted rw at boot, so logs are being written.

I see this in the journalctl -ab output, which shows the read-only partitions getting mounted in that mode....
Jun 09 17:32:20 newport.nwt.net.au dracut: Mounting /dev/disk/by-uuid/21e67610-e9ad-463e-a518-1a0da236e386 with -o rw,relatime,data=ordered,ro
Jun 09 17:32:20 newport.nwt.net.au dracut: Mounting /usr with -o relatime,acl,ro

Strange that the / partition is showing both rw and ro on that mounting command, and the /usr partition showing mount as ro ....
fstab has no such instruction ... So where is that kernel instruction coming from..

Code: Select all
FSTAB snip
# Entry for /dev/sda1 :
UUID=21e67610-e9ad-463e-a518-1a0da236e386 / ext4 relatime,acl 1 1
# Entry for /dev/sda6 :
UUID=4a8129e3-c8e8-45f3-beec-2b8d2f36773b /usr ext4 relatime,acl 1 2

JOURNALCTL snip
Jun 09 17:32:20 newport.nwt.net.au kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: data=ordered
Jun 09 17:32:20 newport.nwt.net.au dracut: Checking ext4: /dev/disk/by-uuid/21e67610-e9ad-463e-a518-1a0da236e386
Jun 09 17:32:20 newport.nwt.net.au dracut: issuing e2fsck -a  /dev/disk/by-uuid/21e67610-e9ad-463e-a518-1a0da236e386
Jun 09 17:32:20 newport.nwt.net.au dracut: /dev/disk/by-uuid/21e67610-e9ad-463e-a518-1a0da236e386: clean, 6308/512064 files, 450756/2046208 blocks
Jun 09 17:32:20 newport.nwt.net.au dracut: Mounting /dev/disk/by-uuid/21e67610-e9ad-463e-a518-1a0da236e386 with -o rw,relatime,data=ordered,ro
Jun 09 17:32:20 newport.nwt.net.au kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: data=ordered
Jun 09 17:32:20 newport.nwt.net.au dracut: Mounted root filesystem /dev/sda1
Jun 09 17:32:20 newport.nwt.net.au dracut: Checking ext4: /dev/disk/by-uuid/4a8129e3-c8e8-45f3-beec-2b8d2f36773b
Jun 09 17:32:20 newport.nwt.net.au dracut: issuing e2fsck -a  /dev/disk/by-uuid/4a8129e3-c8e8-45f3-beec-2b8d2f36773b
Jun 09 17:32:20 newport.nwt.net.au dracut: /dev/disk/by-uuid/4a8129e3-c8e8-45f3-beec-2b8d2f36773b: clean, 1454392/4481024 files, 9133896/17918066 blocks
Jun 09 17:32:20 newport.nwt.net.au dracut: Mounting /usr with -o relatime,acl,ro
Jun 09 17:32:20 newport.nwt.net.au kernel: EXT4-fs (sda6): mounted filesystem with ordered data mode. Opts: acl
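To pull just the mount targets and options out of a long journal, something like the following sketch may help. It assumes the dracut message format seen in the snip above ("Mounting <target> with -o <options>"); dracut_mount_opts is a hypothetical helper name.

```shell
# dracut_mount_opts: read journal text on stdin and print "target options"
# for each dracut "Mounting <target> with -o <options>" line.
dracut_mount_opts() {
    sed -n 's/.*dracut: Mounting \(.*\) with -o \(.*\)$/\1 \2/p'
}

# Two lines copied from the journal snip above.
dracut_mount_opts <<'EOF'
Jun 09 17:32:20 newport.nwt.net.au dracut: Mounting /usr with -o relatime,acl,ro
Jun 09 17:32:20 newport.nwt.net.au kernel: EXT4-fs (sda6): mounted filesystem with ordered data mode. Opts: acl
EOF
```

Against the running system the input would come from: journalctl -b | dracut_mount_opts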
gregM
 
Posts: 21
Joined: Jan 16th, '18, 01:34

Re: System boots with Read Only file systems

Postby morgano » Jun 10th, '18, 06:18

For reference, a snip from my output of "journalctl -b | grep dracut" (this system uses LVM, hence the /dev/vg-mga etc.)
Code: Select all
jun 08 08:18:54 svarten dracut: issuing e2fsck -a  /dev/vg-mga/root
jun 08 08:18:54 svarten dracut: /dev/vg-mga/root: clean, 634285/2235840 files, 6057326/8977408 blocks
jun 08 08:18:54 svarten dracut: Mounting /dev/vg-mga/root with -o rw,noatime,data=ordered
jun 08 08:18:54 svarten dracut: Mounted root filesystem /dev/mapper/vg--mga-root
jun 08 08:18:54 svarten dracut: Switching root
morgano
 
Posts: 538
Joined: Jun 15th, '11, 17:51
Location: Kivik, Sweden

Re: System boots with Read Only file systems

Postby doktor5000 » Jun 10th, '18, 15:32

Please, attach the whole journal log and not only such a small snip without context information. I can't help further with only that information.
If that is during the initrd phase then read-only would basically be normal behaviour before pivot_root has been performed. But without knowing the boot options used (cmdline) I can't tell anything about the context.
doktor5000
 
Posts: 14468
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: System boots with Read Only file systems

Postby gregM » Jun 11th, '18, 01:56

I have also included syslog in case there is anything there.


Oh, I see I can't post a file that big as code.

Will an attachment work?
Here is the output from journalctl -ab ....
journal.log
(170.96 KiB) Downloaded 6 times


and Syslog
syslog.txt
Syslog
(71.58 KiB) Downloaded 4 times
gregM
 
Posts: 21
Joined: Jan 16th, '18, 01:34

Re: System boots with Read Only file systems

Postby doktor5000 » Jun 11th, '18, 19:21

You should take a look at your boot options via drakboot ...

Jun 11 09:09:33 newport.nwt.net.au kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.14.44-server-2.mga6 root=UUID=21e67610-e9ad-463e-a518-1a0da236e386 ro splash quiet noiswmd resume=UUID=9082ca25-cf93-4436-9d78-58c7f25d0a21 audit=0 vga=791


In my opinion the ro should not be there, though I have nothing at hand to compare against right now. But according to https://github.com/torvalds/linux/blob/ ... .txt#L3829
that is exactly what causes the root device to be mounted read-only.
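A quick way to test a command line for a standalone ro token, as a sketch only. cmdline_has_ro is a made-up helper; the sample string is the kernel command line quoted above.

```shell
# cmdline_has_ro: succeed if the given kernel command line contains "ro" as
# a separate word, so that "root=..." or "resume=..." do not match.
cmdline_has_ro() {
    # $1 is left unquoted on purpose so the shell splits it into words.
    for tok in $1; do
        [ "$tok" = "ro" ] && return 0
    done
    return 1
}

cmdline='BOOT_IMAGE=/boot/vmlinuz-4.14.44-server-2.mga6 root=UUID=21e67610-e9ad-463e-a518-1a0da236e386 ro splash quiet noiswmd resume=UUID=9082ca25-cf93-4436-9d78-58c7f25d0a21 audit=0 vga=791'

if cmdline_has_ro "$cmdline"; then
    echo "ro is present on the command line"
else
    echo "ro is absent from the command line"
fi
```

For the running kernel, the string to test would be the contents of /proc/cmdline.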
doktor5000
 
Posts: 14468
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: System boots with Read Only file systems

Postby gregM » Jun 12th, '18, 02:01

Checking drakboot these are the Append attributes showing there.....
splash quiet noiswmd resume=UUID=9082ca25-cf93-4436-9d78-58c7f25d0a21 audit=0
No 'ro' to be seen there....

This is the grub2 config... 'less /etc/default/grub'

GRUB_CMDLINE_LINUX_DEFAULT="splash quiet noiswmd resume=UUID=9082ca25-cf93-4436-9d78-58c7f25d0a21 audit=0 vga=791"
GRUB_DEFAULT=saved
GRUB_DISABLE_OS_PROBER=false
GRUB_DISABLE_RECOVERY=false
GRUB_DISABLE_SUBMENU=n
GRUB_DISTRIBUTOR=Mageia
GRUB_ENABLE_CRYPTODISK=y
GRUB_GFXMODE=1024x768x32
GRUB_GFXPAYLOAD_LINUX=text
GRUB_SAVEDEFAULT=true
GRUB_TERMINAL_OUTPUT=gfxterm
GRUB_TIMEOUT=10

No read-only in those defaults.

At boot, when I edit the boot options from the grub2 splash screen, the 'ro' attribute is in the default command line. Removing it at boot time allows the 2 partitions (sda1, sda6) that were starting read-only to mount in 'rw' mode. The other 2 partitions (sda8, sda10) that were not being mounted still need to be mounted manually.

So the grub default boot command has 'ro' set. From 'less /boot/grub2/grub.cfg':

linux16 /boot/vmlinuz-4.14.44-server-2.mga6 root=UUID=21e67610-e9ad-463e-a518-1a0da236e386 ro splash quiet noiswmd resume=UUID=9082ca25-cf93-4436-9d78-58c7f25d0a21 audit=0 vga=791
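
To see this for every menu entry at once, the kernel lines can be pulled out of grub.cfg with something like the sketch below. The function name kernel_lines is made up for this example, and the list of commands it looks for (linux, linux16, linuxefi) is an assumption about what a generated grub.cfg may contain.

```shell
#!/bin/sh
# kernel_lines: read a grub.cfg on stdin and print one line per kernel
# entry: "ro" or "no-ro", followed by the kernel image path.
kernel_lines() {
    awk '$1 == "linux" || $1 == "linux16" || $1 == "linuxefi" {
        tag = "no-ro"
        # Arguments start at field 3 (field 1 is the command, 2 the image).
        for (i = 3; i <= NF; i++)
            if ($i == "ro") tag = "ro"
        print tag, $2
    }'
}

# Only read the real file if it exists and is readable (usually needs root).
if [ -r /boot/grub2/grub.cfg ]; then
    kernel_lines < /boot/grub2/grub.cfg
fi
```

This only reads the file, so it is safe to run before and after drakboot to compare what was generated.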

It may be that Grub is forcing the read-only mode, rather than the kernel detecting a disk error.
Is there some GRUB_SETTING I need to add or edit in /etc/default/grub?

Re: System boots with Read Only file systems

Postby gregM » Jun 12th, '18, 13:23

I have now tried resetting the boot loader.
In drakboot I tried lilo, and the boot options were no longer read-only.
However, 2 partitions still were not mounted at boot and needed to be mounted manually.

I went back to using Grub2.
I added the rw option to /etc/default/grub, which resulted in both the ro and rw options being set.
After going back to grub2, /boot/grub2/grub.cfg had the 'ro' option set in all the kernel images.

I removed that 'ro' from each (non-recovery) kernel image in grub.cfg (ignoring the advice not to edit that file) and ran drakboot again, where the 'ro' option no longer appeared in the predefined kernel image options.
Now I get a read-write / file system and a read-write /usr file system (sda1, sda6).

That seemed to stop 'ro' being written to the fresh grub.cfg after drakboot. It looks as if the settings in the existing grub.cfg were being read and carried over into the freshly generated grub.cfg when drakboot was run.

Now I just have the issue of not all partitions being mounted at boot: /home and /data don't auto-mount for some reason, but will mount happily with 'mount -a'.

I added the 'defaults' option to the fstab for those 2 slices not mounted, but no change.
Here is the edited fstab. That file had been unchanged for many months, so I doubt it is the cause of that particular issue.

Thanks doktor5000 for the hints to look at the boot setup....

Code: Select all
# Entry for /dev/sda1 :
UUID=21e67610-e9ad-463e-a518-1a0da236e386 / ext4 relatime,acl 1 1
# Entry for /dev/sda10 :
UUID=2c834b22-386b-4a59-87c5-1f880f712867 /data ext4 defaults,acl,relatime 1 2
# Entry for /dev/sda8 :
UUID=20f36172-74ff-483c-8735-47bcfa7b200d /home ext4 defaults,relatime,acl 1 2
## //homeport/Data /home/greg/Data cifs user,credentials=/etc/samba/auth.homeport.greg 0 0
## none /proc proc defaults 0 0
# Entry for /dev/sda9 :
UUID=6562c9ba-2783-4dad-bbb8-8f1ffd96086d /tmp ext4 acl,relatime 1 2
# Entry for /dev/sda6 :
UUID=4a8129e3-c8e8-45f3-beec-2b8d2f36773b /usr ext4 relatime,acl 1 2
# Entry for /dev/sda7 :
UUID=286d0708-9754-4b78-981d-94b018709108 /var ext4 acl,relatime 1 2
# Entry for /dev/sda5 :
UUID=9082ca25-cf93-4436-9d78-58c7f25d0a21 swap swap defaults 0 0
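
A quick way to list which fstab entries did not get mounted at boot, as a rough sketch: it compares the mount points in an fstab against a /proc/mounts-style file. The function names fstab_mountpoints and missing_mounts are made up for this example, and entries marked noauto would show up as missing too.

```shell
#!/bin/sh
# fstab_mountpoints: print the mount point of every active, non-swap
# fstab entry read from stdin (comment and blank lines are skipped).
fstab_mountpoints() {
    awk '$1 !~ /^#/ && NF >= 3 && $3 != "swap" { print $2 }'
}

# missing_mounts FSTAB MOUNTS: print each fstab mount point that does not
# appear as the second field of any line in the file MOUNTS.
missing_mounts() {
    fstab_mountpoints < "$1" | while read -r mp; do
        awk -v mp="$mp" '$2 == mp { found = 1 } END { exit !found }' "$2" ||
            echo "$mp"
    done
}

# Live check, only where both files are readable.
if [ -r /etc/fstab ] && [ -r /proc/mounts ]; then
    missing_mounts /etc/fstab /proc/mounts
fi
```

Run right after boot, before any manual 'mount -a', this should print just the entries that were skipped.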

Re: System boots with Read Only file systems

Postby doktor5000 » Jun 12th, '18, 18:19

gregM wrote:Now I just have the issue of not all partitions being mounted at boot: /home and /data don't auto-mount for some reason, but will mount happily with 'mount -a'.

I added the 'defaults' option to the fstab for those 2 slices not mounted, but no change.
Here is the edited fstab. That file had been unchanged for many months, so I doubt it is the cause of that particular issue.

You would need to look at the journalctl -ab output again after your changes. Essentially, systemd parses the fstab and creates mount units on the fly for every mount point, and these are pulled in by local-fs.target; maybe there's an issue with that.
Also see the man pages for systemd.special and systemd-fstab-generator.
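
To find the generated units, the unit name is derived from the mount point. The sketch below does that conversion for simple paths only; the function name mount_unit_name is made up for this example, and it does not handle paths containing dashes or other characters that systemd escapes. The real tool for this is systemd-escape.

```shell
#!/bin/sh
# mount_unit_name PATH: print the systemd mount unit name for a simple
# mount point (letters, digits and slashes only; no escaping is done).
mount_unit_name() {
    case "$1" in
        /)
            printf '%s\n' "-.mount"
            ;;
        *)
            p=${1#/}    # drop the leading slash
            p=${p%/}    # drop a trailing slash, if any
            printf '%s.mount\n' "$(printf '%s' "$p" | tr '/' '-')"
            ;;
    esac
}

mount_unit_name /home
mount_unit_name /data
```

With the names in hand, 'systemctl status home.mount data.mount' and 'journalctl -b -u home.mount' should show whether the units were generated and what happened to them. 'systemd-escape -p --suffix=mount /home' gives the same name without the limitations of the function above.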

Re: System boots with Read Only file systems

Postby wintpe » Jun 13th, '18, 10:46

One other hint, which may have nothing to do with your errors.

I had bad blocks being reported on one of my SSDs, so I thought it was failing.

It turned out the motherboard had slightly overclocked the CPU the last time I made some changes in the BIOS, and this was the cause.

It's unlikely that's your problem, but check just in case.

regards peter
Redhat 6 Certified Engineer (RHCE)
Sometimes my posts will sound short, or snappy, however its realy not my intention to offend, so accept my apologies in advance.
wintpe
 
Posts: 1168
Joined: May 22nd, '11, 17:08
Location: Rayleigh,, Essex , UK

Re: System boots with Read Only file systems

Postby doktor5000 » Jun 14th, '18, 17:56

On a related note, I asked our grub maintainer about this "ro" option: it seems that it is the default and by itself not an issue; it's always present in a default installation.

But he observed similar problems when one of his SATA controllers was dying: filesystems initially mounted fine, but after some time they were simply read-only, with no real log entries about the cause. So this is also something you should check out. Try a different controller for that disk, or at least a different port.
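
For scanning the kernel log for hints of a controller or disk problem, a filter along these lines might save some scrolling. It is only a sketch: the function name disk_error_lines is made up, and the patterns are guesses at typical kernel messages, so an empty result does not prove the hardware is fine.

```shell
#!/bin/sh
# disk_error_lines: read log text on stdin and print lines that look like
# disk, controller or filesystem trouble (patterns are assumptions).
disk_error_lines() {
    grep -iE 'I/O error|EXT4-fs error|remounting filesystem read-only|ata[0-9]+(\.[0-9]+)?: (exception|error|failed)'
}

# Example with saved log text; on a live system the input could come
# from 'journalctl -k -b' or 'dmesg' instead.
printf '%s\n' \
    'kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6' \
    'kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode' |
    disk_error_lines
```

If smartmontools is installed, 'smartctl -H /dev/sda' gives the drive's own health verdict as a second opinion.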

Re: System boots with Read Only file systems

Postby morgano » Jun 15th, '18, 00:30

Also try a different cable (been there years ago on IDE PATA ... )
Possibly also a different power connector (been there too...)
Mandriva since 2006, then Mageia since 2011 at home & work. Thinkpad T40 T42p T43 T60 T61 T400. Aspire 7. Fileserver. Workstation using LVM, LUKS, VirtualBox, BOINC, Dropbox, CAD, urpmi-proxy, NextCloud...
morgano
 
Posts: 538
Joined: Jun 15th, '11, 17:51
Location: Kivik, Sweden

Re: System boots with Read Only file systems

Postby gregM » Jun 15th, '18, 08:03

Thanks for the hints to try different SATA ports and cables.
I tried another cable and another port.
But there was no change to the current state. The system boots read-write, but the 2 partitions defined in fstab are not mounted until a mount -a is run.
I added the x-initrd.mount option to fstab for those partitions not auto-mounting at boot, but it had no effect.
I got a bit lost trying to follow the systemd path suggested and did not see anything revealing in the journalctl -ab output.

My new drive has just arrived, so I am going to put a nice new system onto it overnight....

Re: System boots with Read Only file systems

Postby morgano » Jun 16th, '18, 12:04

If you have not begun installing yet, an interesting test would be to just image the old drive to the new and see if that boots OK.

Re: System boots with Read Only file systems

Postby gregM » Jun 17th, '18, 04:06

I have resolved the issue by putting a fresh system onto a new disk.
I can't say why it happened or what caused the problem.
My workaround of changing the grub.cfg file so that the 'ro' option is not set on the kernel image boot setup (which did allow 'rw' file systems) was a hack that is not reflected in a fresh install, as mentioned in the advice from doktor5000's grub maintainer.

So the underlying cause is somewhere else ... Possibly a systemd error that did not allow fstab to be read correctly during the boot sequence.

I don't think I can mark the issue [solved]. At best it has been put to rest, and the cause remains unknown.

I can say a clean system is responding faster than my old 5 or 6 year old legacy install, upgraded through Mageia4, Mageia5 and Mageia6. There were a lot of stale files lying about, from KDE 3, 4 and Plasma. The system was built up and upgraded from the original Mageia, after the Mandriva fork, from my original Mandrake. So a totally fresh version of Mageia6 is a nice clean start.

Thanks for the suggestions and help.

Re: System boots with Read Only file systems

Postby doktor5000 » Jun 17th, '18, 15:16

gregM wrote:So the underlying cause is somewhere else ... Possibly a systemd error that did not allow fstab to be read correctly during the boot sequence.

I don't think I can mark the issue [solved]. At best it has been put to rest, and the cause remains unknown.

Well, who else should figure out the cause?
Hence my referral to the relevant documentation: viewtopic.php?p=72774#p72774

