[SOLVED] Excessive logs, filesystem full, fsck differences

Post by morgano » May 5th, '13, 13:54

EDIT, clearing things up
I had two problems:

1) The root partition filled up with excessive error messages caused by a USB device that did not have its storage (microSD) inserted.
Upstream, assigned: https://bugzilla.kernel.org/show_bug.cgi?id=43191
I posted a bug so we keep track: https://bugs.mageia.org/show_bug.cgi?id=10038

2) For some reason, different versions of fsck reported different results for my filesystem.
Only the system's own fsck found problems; a USB live system fixed the rest, and after deleting the logs the system ran fine.
I have no energy to track that filesystem problem any further.

/EDIT


This is my workstation (see sig).
It is always on, busy torrenting Mageia and processing for BOINC when it isn't spending a few cycles on me.
It is paused or rebooted maybe every other month for dust cleaning or other maintenance, or when I just want peace and quiet ;)
I have been wondering how long a filesystem on an SSD may last when it is written to this often, and how a failure would show itself.
Maybe this is it, but unfortunately I am none the wiser.

Symptom: Yesterday a popup said / was nearly full. df and gkrellm said it was 100% used. Actually df said size 17GB, used 16GB, 100%; rounding error?
There were a few GB left last time I checked. Strange. I could not find which files caused it. I cleaned out a few hundred MB.
This morning I got the same message and cleaned out another 500 MB. It still shows 100% full, and the system even crashes while shutting down.
I tried to force a filesystem check on next boot with # touch /forcefsck; ls showed that the file was written, but it was ignored on boot.
I have noticed before that this does not work, maybe because / is a partition in LVM on LUKS.

So I booted sysresccd, made a disk image to an external disk for backup, and manually unlocked and activated my /.
However... e2fsck says my / is clean. Even -fv does not help. The same is true for all filesystems.
All filesystems are ext4. There is no separate /home, but instead a separate filesystem holding most user data dirs, softlinked to dirs in ~.

So now I am scratching my head while backing up all user filesystems from that drive, before reinstalling.
It would feel much better if I could understand what is wrong.

Well, this is mga1 and I was just waiting for mga3 final before reinstalling, so maybe that is what I will do today.
But I wonder if I should use that drive any more. I was planning to buy a new drive before installing mga3...

Trying to figure out how I can read as much status info as possible from that drive... I will have a look at S.M.A.R.T.
Looking at the system logs might also be interesting.
Maybe I will also see if there is newer firmware I can put on it. Bah!
I really must have it working tomorrow.

But that stupid damn silicon shit ain't gonna put me down!

Right now I will go out in the sunshine with my family and have fun :)
Last edited by morgano on May 9th, '13, 14:55, edited 2 times in total.
At home & work Mandriva since 2006, Mageia 2011. Thinkpad T40, T43, T60, T400, T510, Dell M4400, M6300, Acer Aspire 7. Workstation using LVM, LUKS, VirtualBox, BOINC
morgano
 
Posts: 1475
Joined: Jun 15th, '11, 17:51
Location: Kivik, Sweden

Re: Filesystem broken but exfsck finds it OK...

Post by doktor5000 » May 5th, '13, 15:21

Well, you're throwing too much into one pot, need to puzzle it together.

For the used space, files that were deleted but are still held open by a process will show in df, but not in du. So there may be a difference.
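To check for that case, one option is to look for files that were deleted while a process still had them open. A minimal sketch, assuming a Linux /proc filesystem; the function name list_deleted_open is made up for illustration, and when run as a normal user it only sees that user's own processes:

```shell
# List open file descriptors whose target has been unlinked.
# The kernel appends " (deleted)" to the symlink target of such files.
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors we may not inspect; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s\t%s\n' "$fd" "$target" ;;
        esac
    done
}
```

Where lsof is installed, lsof +L1 gives a similar listing. Space held this way is released once the owning process closes the file or exits.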
Best bet is to check with
du -mx --max-depth=2
or with something like baobab if you don't know where that used space is.
What I mostly do is
du -mx / | sort -rn | head -75
which shows the 75 biggest directories and stays within the / filesystem (add -a to du to list individual files as well).
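If the goal is to find individual files rather than directories, a sketch along these lines may help. It assumes a stat that understands -c (GNU coreutils or busybox); biggest_files is a hypothetical helper name, and the sizes are apparent sizes in bytes rather than allocated blocks:

```shell
# Print the N largest regular files under DIR without crossing into
# other mounted filesystems (-xdev), largest first.
biggest_files() {
    dir=${1:-/}
    n=${2:-20}
    find "$dir" -xdev -type f -exec stat -c '%s %n' {} + 2>/dev/null |
        sort -rn | head -n "$n"
}
```

For example, biggest_files / 20 run as root would list the twenty largest files on the root filesystem.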

For the file system check, why don't you use Mageia safe mode for that? Then do an fsck -fnv /dev/sdX,
does it still say that FS is clean? If it says it's clean, why do you think it's broken at all?
Cauldron is not for the faint of heart!
Caution: Hot, bubbling magic inside. May explode or cook your kittens!
----
Disclaimer: Beware of allergic reactions in answer to unconstructive complaint-type posts
doktor5000
 
Posts: 18018
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Filesystem broken, newer fsck finds it OK, but not origi

Post by morgano » May 5th, '13, 20:33

Thank you for the quick reply and tips.
doktor5000 wrote: For the used space, opened files will show in df, but not in du. So there may be a difference.

OK, but I removed 500+ megabytes and rebooted.
(Actually I had to power off, since KDE hung halfway through shutdown and the keyboard became completely unresponsive; not even a short press of the power button convinced the system to shut down.)
On the next boot X failed, complaining that the disk was full.
Then I decided something is definitely wrong.

For the file system check, why don't you use Mageia safe mode for that?

I wanted to back up before risking wrecking it through a tool problem or my own mistake :) Now I am ready:

Before KDE became unusable (now it cannot start at all), I was hunting for disk usage with fsview, which is the tool I like most.
baobab is not installed. Is it now called File system analyzer, part of GNOME? Anyway:

Plain # du at / in safe mode gives 71583157
# du -mx / | sort -rn | head -75 gives 15924 for /, 5838 for /usr, 5707 for /home, 2827 for /var, and then lower values for the others
# df reports for / (/dev/mapper/vg0-Mageiaroot) in 1k blocks: 17063352 total, 16464672 used, 100%
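The 100% despite used being smaller than total is probably not a rounding error: ext4 normally keeps a share of blocks reserved for root (5% unless changed with mke2fs -m or tune2fs -m), and df computes Use% against the space available to ordinary users. A rough sketch of that arithmetic with the numbers above, assuming the default 5% reserve (the real figure is the "Reserved block count" line in tune2fs -l output); both function names are invented here:

```shell
# Blocks still available to non-root users after the root reserve.
avail_to_users() {
    total=$1 used=$2 reserve_pct=$3
    free=$((total - used))
    reserved=$((total * reserve_pct / 100))
    if [ "$free" -gt "$reserved" ]; then
        echo $((free - reserved))
    else
        echo 0
    fi
}

# Use% the way GNU df derives it: used / (used + available), rounded up.
df_use_percent() {
    used=$1 avail=$2
    nonroot_total=$((used + avail))
    echo $(( (used * 100 + nonroot_total - 1) / nonroot_total ))
}

avail=$(avail_to_users 17063352 16464672 5)
echo "available to users: $avail blocks"
echo "Use%: $(df_use_percent 16464672 "$avail")%"
```

About 585 MB is still physically free here, but that is less than the assumed reserve, so ordinary processes see a full disk.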
Thank you for putting me on the right track here:
# fsck -fnv /dev/mapper/vg0-Mageiaroot
It responds ***** WARNING: Filesystem still has errors *******
After that there is too much to write here manually. I cannot see a specific error, but I note it says 1 large file.

That is very strange, because when I boot sysresccd, unlock LUKS, activate the LV and scan it using the exact same command line, it says it is clean!
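For reference, the rescue-system procedure described here (unlock LUKS, activate the volume group, check read-only) roughly corresponds to the commands below. This sketch only prints them, since they need root and the real device names. /dev/sdX2 and the mapping name cryptroot in the usage line are placeholders, not taken from this system; vg0 and Mageiaroot come from the device path above, which LVM normally also exposes as /dev/vg0/Mageiaroot:

```shell
# Print (not run) the commands to check an ext4 LV inside LVM on LUKS.
print_rescue_fsck_steps() {
    luks_part=$1 map_name=$2 vg=$3 lv=$4
    echo "cryptsetup luksOpen $luks_part $map_name"   # unlock the LUKS container
    echo "vgchange -ay $vg"                           # activate the volume group
    echo "fsck -fnv /dev/$vg/$lv"                     # forced, read-only check
    echo "vgchange -an $vg"                           # deactivate again
    echo "cryptsetup luksClose $map_name"             # lock the container
}

print_rescue_fsck_steps /dev/sdX2 cryptroot vg0 Mageiaroot
```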

The broken system's fsck says it is from util-linux-ng 2.18, while sysresccd's fsck says util-linux 2.22.2. And of course there is a newer kernel and other differences...

To see the internal status of the disk, I tried the MHDD boot disk on sysresccd, but it fails to access the drive. I will check whether Corsair has its own tool available.

The right thing to do now is to see whether a mga2 liveCD can see the error and fix it (it is about the same age as the latest updates of this mga1). If not, try a mga1 or mga3 liveCD.
But I have already decided to format the drive in an attempt to get rid of any problematic residue, and I am lazy, so I just booted safe mode again and ran # fsck -fv /dev/mapper/vg0-Mageiaroot. It found and fixed about 5 problems in total. I rebooted. Still a problem, so I ran fsck again: 2 problems. Hmmmm. And still 100% used. Maybe it really hurts to run it on a mounted filesystem, even in single user mode... OK, I will try a mga live CD. Stand by....
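Checking a filesystem that is mounted read-write can indeed report errors that are not really on disk, and repairs made that way can cause damage, which might also explain why the rescue system saw a clean filesystem. A small guard, as a sketch: it assumes the device is listed in /proc/mounts under exactly the name you pass (device-mapper volumes can appear under more than one name), and is_mounted is a made-up helper:

```shell
# Succeed if DEV appears as a mount source in the mounts table.
# The optional second argument points at another table, e.g. for testing.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

dev=/dev/mapper/vg0-Mageiaroot
if is_mounted "$dev"; then
    echo "$dev is mounted: use fsck -n only, or check it from a rescue system"
else
    echo "$dev is not mounted: fsck -f can repair it safely"
fi
```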

Another idea I have is that maybe there is a size mismatch between the filesystem and the partition, if something went wrong when I let diskdrake grow it some months ago, while running on it. (I had a backup, yes.) But it has been working nicely for a long time since then, until now.

Re: Filesystem broken but exfsck finds it OK...

Post by morgano » May 5th, '13, 23:13

Attaching the drive to a mga2 KDE machine, I ran fsck (from util-linux 2.21.1) and it fixed a few problems.
And using fsview I spotted:
/var/log/syslog and messages at 825 MB each, kernel/errors.log at 408 MB, warnings at 413 MB
Interesting relationship!

syslog and messages both contain these two lines, repeated more than 200 times per second:
May 4 13:24:57 svarten kernel: sd 21:0:0:0: [sdb] Assuming drive cache: write through
May 4 13:24:57 svarten kernel: sd 21:0:0:0: [sdb] Test WP failed, assume Write Enabled

In /var/log/kernel/errors.log "only" lines like this occur:
May 4 13:23:59 svarten kernel: sd 21:0:0:0: [sdb] Assuming drive cache: write through

And in /var/log/kernel/warnings.log "only" lines like this occur:
May 4 13:24:10 svarten kernel: sd 21:0:0:0: [sdb] Test WP failed, assume Write Enabled

sdb? Oh, the cheap mini video camera I bought two days ago and plugged in for charging! I have not yet equipped it with the SD card, as it was out of stock. And apparently the kernel stumbles trying to connect that "USB drive" more than 200 times per second and never gives up. Doh.

What a nice time-delayed system bomb. Just plug it into a USB socket and the machine goes down some hours later!
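A flood like this is easy to spot once timestamps are stripped and identical messages are counted. A sketch, assuming the traditional syslog line layout seen above (month, day, time, host, then the message); top_log_messages is a hypothetical helper:

```shell
# Count identical messages in a syslog-style file, ignoring the first
# four fields (month, day, time, host), and print the N most frequent.
top_log_messages() {
    file=$1
    n=${2:-10}
    awk '{
            $1 = $2 = $3 = $4 = ""      # blank out timestamp and host
            sub(/^ +/, "")              # drop the leading spaces left behind
            count[$0]++
        }
        END { for (m in count) print count[m] "\t" m }' "$file" |
        sort -rn | head -n "$n"
}
```

Something like top_log_messages /var/log/syslog 5 should then put the two sd messages at the top, together with their counts.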


So now I have found what triggered my problems.

I simply deleted the four files. They were the four biggest files in that filesystem.
Strangely, fsck reports "1 large file" both before and after. That does not feel reliable...

At last, df says that the filesystem now has 2.2 GB free.
And the system boots and runs OK! Hurray.

Strange that the newer versions of fsck did not find the problem the old version saw.
I will update the SSD firmware and format it anyway, just to be sure.
And try to read the drive's internal status. And install mga3 fresh, and use the backed-up user data.

Re: Filesystem broken but exfsck finds it OK...

Post by doktor5000 » May 6th, '13, 23:01

Please report that as a bug and provide the usb/pci id of that camera.

Re: [SOLVED] Edit: Filesystem full, different fsck say diffe

Post by morgano » May 7th, '13, 00:33

I will, once I can also report how it behaves with a microSD installed, and after testing on mga2 and 3. Give me a few days.

Re: [SOLVED] Edit: Filesystem full, different fsck say diffe

Post by morgano » May 9th, '13, 14:49


