
Upgraded Mageia 1 to 2 - Boot volume no longer exists

PostPosted: Jan 16th, '13, 19:56
by Will94
Hello,

I work for an academic department at a large university. We have four high-end workstations used by our researchers for crunching large data sets. These were originally Mandriva 2010 machines that I had upgraded to Mageia 1 last summer. As Mageia 1 has reached EOL, I upgraded three of these machines to Mageia 2 this morning.

These three machines no longer boot. The GRUB boot menu loads fine. I have a small Windows partition on each machine, and I can still boot to it. However, if I attempt to boot to Mageia, after about 20 seconds, I receive the error:

dracut warning: Cancelling resume operation. Device not found
dracut warning: "/dev/mapper/isw-dhhcahgjca_CookieMonsterp6" does not exist


'CookieMonster' is the name of the boot volume. These machines have mirrored SATA hard drives controlled by an Intel Matrix Storage controller.
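
In case it helps with diagnosis: the array can be inspected from a live CD or rescue shell with dmraid, assuming the live system ships it. Roughly (a sketch, not the exact commands I ran):
Code:
dmraid -r        # list the disks carrying Intel Matrix (isw) metadata
dmraid -s        # show the discovered RAID sets and their status
dmraid -ay       # activate the sets; the /dev/mapper/isw_* nodes should appear
ls /dev/mapper/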

I went through the upgrade process on a couple of test machines, and it went well both times. Neither of those machines had any kind of RAID setup, so I think that the Intel controller is the problem.

I would greatly appreciate any help as these are production machines.


Thank you!

Will

EDIT: I upgraded these machines using the following commands:
Code:
su -
urpmi.removemedia -a
urpmi.addmedia --distrib --mirrorlist http://mirrors.mageia.org/api/mageia.2.x86_64.list
urpmi --replacefiles --auto-update --auto
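
In hindsight, before rebooting after an upgrade like this it would probably be worth checking that dmraid survived and made it into the new initrd. Something like this (a sketch; the initrd file name is a placeholder for whatever the new kernel installed):
Code:
rpm -q dmraid                                                    # is dmraid still installed after the upgrade?
lsinitrd /boot/initrd-<new-kernel-version>.img | grep -i dmraid  # did dracut include the dmraid bits?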

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 16th, '13, 23:10
by doktor5000
Failsafe mode or the old kernel does not work anymore? Have you tried editing the grub/kernel options and removing the resume= option, as per your error message?
Apart from that, your best bet is to boot via a live CD and compare the fstab entries with the real devices. Unfortunately I'm not into LVM under Linux and can't help you troubleshoot that :/
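
Something along these lines from the live session, just a sketch (the partition name is taken from your error message, adjust it to your real root partition):
Code:
dmraid -ay                                              # activate the fakeRAID set, if the live CD has dmraid
ls /dev/mapper/                                         # what device-mapper actually provides
mount /dev/mapper/isw-dhhcahgjca_CookieMonsterp6 /mnt   # mount the installed root partition
cat /mnt/etc/fstab                                      # what the installed system expects
blkid                                                   # the devices and UUIDs that really exist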

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 16th, '13, 23:40
by Will94
doktor5000 wrote:Failsafe mode or the old kernel does not work anymore?

No, failsafe mode gives the same error, and I had already run 'urpme --auto-orphans', which I think removes the old kernel. Trying to boot to the old kernel gave numerous errors about missing files.

doktor5000 wrote:Have you tried editing the grub/kernel options and removing the resume= option, as per your error message?

I tried that just now. It didn't seem to make a difference; I received a very similar error, although it no longer mentions the resume operation:

dracut Warning: Unable to process initqueue
dracut warning: "/dev/mapper/isw-dhhcahgjca_CookieMonsterp6" does not exist
dracut warning: "/dev/mapper/isw-dhhcahgjca_CookieMonsterp6" does not exist

Dropping to debug shell.


I will look at the fstab entries next.
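
From the debug shell itself I can at least poke around; this is roughly what I plan to check (assuming those tools are actually present in the initramfs):
Code:
ls /dev/mapper/       # which device-mapper nodes were actually created
dmsetup ls            # the same information from device-mapper itself
cat /proc/partitions  # the raw disks the kernel sees
dmraid -ay            # try to activate the isw set by hand, if dmraid is in the initramfs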


Thank you,

Will

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 17th, '13, 03:48
by Will94
When I booted into the live Knoppix disk, I looked to see if there was any relevant information in /etc/fstab. There wasn't, but I had to manually mount my hard drive, so that wasn't surprising to me.

I have a PC at home which has a similar Intel Array Controller and dual boots Windows/Mageia 1. I upgraded it to Mageia 2, and now have the same problem at home that I do at my office. I am certain that it's the controller, but I don't know what to do about it.

I looked at the menu.lst file for the one machine that is still on Mageia 1, and it is loading the OS with the same 'dev' information as the three Mageia 2 machines that won't boot.

:?:

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 17th, '13, 07:37
by doktor5000
Please file a bug report as a regression: https://wiki.mageia.org/en/How_to_report_a_bug_properly
For completeness's sake and also to allow others to follow up on that bug report, please also post the link to it here in the thread, thanks.

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 17th, '13, 11:49
by filip
Will94 wrote:
Failsafe mode or the old kernel does not work anymore?

No, failsafe mode gives the same error, and I had already run 'urpme --auto-orphans', which I think removes the old kernel. Trying to boot to the old kernel gave numerous errors about missing files.

This famous 'urpme --auto-orphans' should be used with special care, as it sometimes removes packages that are still needed. I recall some bugs on that topic too. I just use its output as a guide and don't let it actually remove anything.
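
If you only want to see what it considers orphans, something like this should work (off the top of my head, assuming urpmq here supports the --auto-orphans option):
Code:
urpmq --auto-orphans   # only list the packages currently considered orphans
# review the list, then remove individual packages explicitly with urpme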

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 17th, '13, 14:25
by nigelc
Hello,
I remember reading somewhere that you can't boot from an LVM partition.
Can you make /boot an ext4 partition or something else?

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 17th, '13, 14:41
by gohlip
nigelc wrote:Hello,
I remember reading somewhere that you can't boot from an LVM partition.
Can you make /boot an ext4 partition or something else?


I think you mean making /boot outside of LVM and, for good measure, if using grub-legacy, making it ext2.
It's also good to check menu.lst (or grub.cfg) and fstab for possibly wrong partition paths.
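
From a live/rescue session with the root filesystem mounted under /mnt, a quick check could look roughly like this (just a sketch, the paths may differ on your machine):
Code:
grep -E 'root=|resume=' /mnt/boot/grub/menu.lst   # what GRUB passes to the kernel
grep -v '^#' /mnt/etc/fstab                       # what the system expects to mount
ls /dev/mapper/                                   # what device-mapper actually provides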

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 17th, '13, 18:19
by Will94
doktor5000 wrote:Please file a bug report as a regression: https://wiki.mageia.org/en/How_to_report_a_bug_properly
For completeness's sake and also to allow others to follow up on that bug report, please also post the link to it here in the thread, thanks.

I started working on a bug report. I don't have much experience with this and have a couple of questions.

I don't know what to put as the source rpm. An upgrade replaces many if not most rpms.

Also, I don't know what to put for the severity. In a way it's critical, because it killed my systems, and this Intel controller has been shipping in Dell OptiPlexes for at least the last five years. On the other hand, the problem is apparently uncommon, as I haven't been able to find any information about it through Google.

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 18th, '13, 00:04
by doktor5000
Will94 wrote:I don't know what to put as the source rpm. An upgrade replaces many if not most rpms.

Also, I don't know what to put for the severity. In a way it's critical, because it killed my systems, and this Intel controller has been shipping in Dell OptiPlexes for at least the last five years. On the other hand, the problem is apparently uncommon, as I haven't been able to find any information about it through Google.


Source rpm is not a mandatory field; if you don't know or are unsure which package causes it, leave it out.
For the severity, either major or critical; I'd say the latter: https://bugs.mageia.org/page.cgi?id=fie ... g_severity

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 23rd, '13, 00:48
by Will94
I spent a ton of time on this problem last week and am no longer certain that it should be considered a bug.

I tried to install Debian 6 "stable" on this machine and ran into a similar problem. The partition utility would show three devices: a RAID device and two SATA hard drives. It was essentially "too smart" for its own good. Windows, by contrast, only shows a single hard drive, which is what I wanted Mageia to be "fooled into" seeing. I tried to create a swap and an ext4 partition on the RAID device and let it push the changes to the two hard drives, but I couldn't write any partitions to the RAID device.

After a lot more searching with Google, I came across someone else with my issue who had solved it by adding the option 'dmraid=on' at installation. Once that option was added, I could write my partitions in Debian, but after the installation completed, I couldn't write a boot sector to either the RAID device or the hard drives. I tried with both GRUB and LILO and couldn't write a boot sector with either of them. The 'dmraid=on' switch didn't help with Mageia at all.

I decided to install Mageia 2 on one of the drives and lose my redundancy. However, even after I disabled one of the drives in the BIOS, the Mageia partition utility still saw it. I had to open the case, pull the power to the drive, and remove the mirror in the Intel controller's BIOS before I could write a partition table with the Mageia installer. The machine is now running fine off a single hard drive.

If the machine weren't so old, I would buy a hardware RAID controller for it. Apparently the software controller no longer works with any flavor of Linux... at least not Mageia, Debian, or openSUSE (which I tried at home). Should I just let this go without a report?


Thank you,

Will N.

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 23rd, '13, 04:06
by nigelc
Hello,
Is this a software RAID done through the BIOS?

Maybe the newer kernels do not support it.
The Winfast/AMD system I have has the same sort of thing. I have never tried to run it as a RAID controller.

Nigel

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 23rd, '13, 04:17
by Will94
Here is the Wikipedia entry for it. It's a "firmware RAID" system. It works fine in Windows and used to work fine with Linux. Maybe a firmware update would help?!? If I'd realized what it was when I bought the machines, I'd have bought a hardware RAID adapter for each one.

http://en.wikipedia.org/wiki/Intel_Rapid_Storage_Technology

Re: Upgraded from Mageia 1 to 2 - Boot volume no longer exis

PostPosted: Jan 23rd, '13, 07:04
by nigelc
Hi,
Have you tried installing mdadm?
It's in the Mageia repos:
https://en.wikipedia.org/wiki/Mdadm
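
Even without setting anything up, mdadm can at least report whether it sees the firmware RAID. Something like this (read-only; /dev/sda is just a guess at one of the member disks):
Code:
mdadm --detail-platform    # show the Intel Matrix (IMSM) capabilities of the controller
mdadm --examine /dev/sda   # look for IMSM metadata on a member disk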

Re: Upgraded Mageia 1 to 2 - Boot volume no longer exists

PostPosted: Jan 26th, '13, 05:01
by ronparent
You are right. It is a software RAID (also commonly called fakeRAID), using a pseudo-RAID BIOS on the motherboard to configure it initially. In Windows this RAID is accessed through a driver so that the partitions on the RAID drive can be read and written to. In Linux this is done using dmraid. Thus, if dmraid is installed on the booted Linux OS, the dmraid=on switch at boot will activate the RAID and map the partitions using the /dev/mapper symbolic links (and there are others) set up at boot. The main advantage of this type of RAID is that, when a system includes both Windows and a Linux distro, the Windows files on the RAID can be read from Linux.

So if you don't find the RAID partitions on boot, it appears that dmraid has not been installed. Installing dmraid in a live CD session would automatically activate those partitions for the duration of that session. Similarly, doing a chroot install of dmraid to the Linux system on the RAID during that same session should fix that install so that it will boot.
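
In outline, that repair could look something like this from the live session (a sketch only; the partition name comes from the error message earlier in the thread, and the kernel version has to be the one installed on the RAID, not the live CD's):
Code:
dmraid -ay                                              # activate the isw RAID set
ls /dev/mapper/                                         # the CookieMonster partitions should now be listed
mount /dev/mapper/isw-dhhcahgjca_CookieMonsterp6 /mnt   # mount the Mageia root partition (adjust if p6 is not root)
mount --bind /dev  /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys  /mnt/sys
chroot /mnt
urpmi dmraid                                            # install dmraid inside the installed system
dracut --force /boot/initrd-<version>.img <version>     # rebuild the initrd so it can map the RAID at boot
exit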

Don't use mdadm, because it sets up a completely different set of symbolic links, which may or may not work, with unpredictable results!

Incidentally, when dmraid has activated the RAID, the partitioner will show it as a single drive. The partitioner will also separately show the component disks as unpartitioned space. Do not make partitioning changes to a component drive as that would destroy the integrity of the RAID (especially if it is a RAID0).