[HACK WORKS] Why does "nokmsboot" disappear during MGA6 boot?

Post by jaywalker » Jan 27th, '17, 02:56

...and when we figure that one out we can move on to my follow-up question which is
Why do I need the
Code: Select all
 ...nokmsboot...
option to boot a machine with AMD graphics?


I think I have just had a flash of inspiration which may answer the second question. This computer uses the AMD A10-5800 with integrated 7660 graphics, but a couple of months ago I added an Nvidia card used only by Blender for accelerated CUDA rendering.

I think that, despite telling the MGA6 installer to configure ONLY the Radeon device for the displays, systemd or some other loose code in the early boot is detecting the presence of the Nvidia card and jumping to the conclusion that it can be used for the display. That would explain the two monitors going "dead" though it does leave room to consider why the box's reset button and the keyboard are also disabled.

That brings us back to the main question. The nokmsboot option on the grub command line is not permanent. When I boot successfully after adding the option to the default entry in /boot/grub/menu.lst, I discover that the option has been removed! If I don't replace it then the next boot will fail within seconds and I have to use the failsafe boot menu option (where the nokmsboot option appears to be a survivor), or F3 Defaults and add the nokmsboot option to the boot command.

It is probably not some stray malware doing this, though whether it is systemd or other loose code is unclear. It certainly seems to be smarter than me. I tried to prevent it by setting menu.lst to read-only, but the offending code seems to be able to work around that.

I suppose there is another question which seriously needs attention; what possible good reason could there be for some piece of poorly written code to mess with MY boot command while it is booting?
Last edited by jaywalker on Feb 5th, '17, 19:29, edited 1 time in total.
jaywalker
 
Posts: 341
Joined: Nov 17th, '11, 02:38
Location: Belfast, Northern Ireland

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by gohlip » Jan 27th, '17, 10:46

I'm on mga6, and it boots without 'nokmsboot'. I have radeon.
This bug https://bugs.mageia.org/show_bug.cgi?id=8540 should have (long ago) been fixed.
Yes, I do always test on any Mageia version if this nokmsboot is required. viewtopic.php?f=7&t=11334#p66033
It seems to change with each alternate Mageia version (and it shouldn't).

Mageia developers don't take action on this forum (they should, or at least read the forum) and people here will ask you to write bug reports.
Why do we live? To prove not everything in nature has a purpose.
gohlip
 
Posts: 573
Joined: Jul 9th, '12, 10:50

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by gohlip » Jan 28th, '17, 10:01

To go back to the OP issue at hand,


jaywalker wrote:The nokmsboot option on the grub command line is not permanent... I discover that the option has been removed! ... and I have to use the failsafe boot

I suspect there is something wrong in your assumption.
menu.lst (and grub.cfg) will not change (if saved and no 'update-grub' is run) unless the changes are made manually (at the grub menu, press 'e'...) or the grub menu you boot from does not belong to the OS where the changes were made.

Can you recheck each of these cases because that would be highly unusual. If any of these happens, let us know and we'll try to find a way to get you to boot correctly again.


Slightly off topic, but because you asked, not by the rest of us.
jaywalker wrote:I have never before encountered a case of some third party overwriting MY bootloader

Every newly installed OS will override YOUR bootloader with its own, and that bootloader will then remain unchanged even if you have, say, kernel changes in another OS (you need to run 'update-grub' in the OS that owns the 'default' bootloader).
So to repeat, recheck if Mageia is actually the 'default' bootloader for your system.

ps: systemd has nothing to do with bootloaders (unless you use systemd boot (bootctl), which won't work with msdos/bios-legacy anyway).

ps:
jaywalker wrote:I tried to prevent it by setting menu.lst to read-only,

How? What did you do? Is this preventing your own changes (nokmsboot) taking place? Please explain.

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by jaywalker » Jan 29th, '17, 02:28

Hey guys, do you wanna take this outside?

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by doktor5000 » Jan 29th, '17, 12:34

jaywalker wrote:Hey guys, do you wanna take this outside?


FWIW, I've split out the posts about bugreports into a separate thread viewtopic.php?f=18&t=11587
Cauldron is not for the faint of heart!
Caution: Hot, bubbling magic inside. May explode or cook your kittens!
----
Disclaimer: Beware of allergic reactions in answer to unconstructive complaint-type posts
doktor5000
 
Posts: 18042
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by jaywalker » Jan 29th, '17, 18:35

gohlip wrote:To go back to the OP issue at hand,


The nokmsboot option on the grub command line is not permanent... I discover that the option has been removed! .... and I have to use the failsafe boot

I suspect there is something wrong in your assumption.
menu.lst (and grub.cfg) will not change (if saved and no 'update-grub' done) unless these changes are done manually (at grub menu, press 'e'...) or the 'booting' grub menu is not from the OS itself (other than the OS where changes is done).

Can you recheck each of these cases because that would be highly unusual. If any of these happens, let us know and we'll try to find a way to get you to boot correctly again.


I change menu.lst to re-insert the nokmsboot option using mc in a root shell in MGA5 (which boots from the sole internal disc on this machine). I select which OS to boot by using the motherboard's F8 boot select option, which allows me to select the disc from which to boot. I take note of the timestamp on the corrected menu.lst file and re-boot the machine.

While the motherboard is sorting itself out I tap the F8 key and on appearance of the drive list I select the external USB device containing only MGA6. When the MGA6 grub menu appears I let it time-out to boot to the default desktop, which it happily does.

On my desktop I start a root console and examine the menu.lst file. I could also start MCC and use the boot loader section to do the same thing, but I prefer to use mc so that I can verify that the time-stamp has changed and re-insert my nokmsboot option at the same time.

On checking the journal I will also see that the new time-stamp is very close to the entries at the start of the logged boot process, implying that some action not directed by me has been taken during the boot to alter my bootloader menu.
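The same comparison can be made without reading timestamps by eye, by checking the file's modification time against the moment the kernel started. This is only a sketch: it assumes a Linux system with GNU stat and /proc/uptime, and the helper name modified_since_boot is made up for illustration.

```shell
#!/bin/sh
# Sketch only: modified_since_boot is a hypothetical helper name.
modified_since_boot() {
    # Succeeds if file $1 was last modified after the kernel started.
    now=$(date +%s)                        # current time, seconds since epoch
    up=$(cut -d. -f1 /proc/uptime)         # whole seconds since boot
    mtime=$(stat -c %Y "$1") || return 2   # GNU stat: mtime as epoch seconds
    [ "$mtime" -ge $((now - up)) ]
}

target=/boot/grub/menu.lst                 # path discussed in this thread
if [ -e "$target" ]; then
    if modified_since_boot "$target"; then
        echo "$target was rewritten during or after this boot"
    else
        echo "$target has not been touched since before this boot"
    fi
fi
```

If the first message appears straight after a boot in which nobody edited the file by hand, something in the boot sequence wrote to it.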

gohlip wrote:Slightly off topic, but because you asked, not by the rest of us.
I have never before encountered a case of some third party overwriting MY bootloader

All newly installed OS will override YOUR bootloader with theirs. and will remain unchanged even if you have say, kernel changes in other OS (and you need to 'update-grub' in the 'default' bootloader OS)
So to repeat, recheck if Mageia is actually the 'default' bootloader for your system.


I agree with your observation that installing an OS will override a bootloader if installed to the same disc. However it has been my repeated experience that if the new OS is confined to its own drive and shares no partitions with any other system on any other drive then it will leave any other bootloader on any disc other than its own completely alone. The only exception to this rule that I am aware of is if, during the installation process, I elect to install the new OS bootloader in place of any existing bootloader on the PC's default boot drive.

gohlip wrote:ps: systemd has nothing to do with bootloaders (unless you use systemd boot (bootctl) and won't work with msdos/bios-legacy anyway).


I am greatly relieved to hear that systemd has not yet found a way to interfere with the operation of the bootloader, but perhaps it is headed that way. Assuming that the whole boot process is controlled by systemd once the bootloader has done its job, then something under the control of systemd is likely poking around and fiddling where it shouldn't.

gohlip wrote:ps:
I tried to prevent it by setting menu.lst to read-only,

How? What did you do? Is this preventing your own changes (nokmsboot) taking place? Please explain.


Finally an easy question to answer :~) I simply cleared the write bit in the permissions of menu.lst (making it read-only), but it didn't do the trick.
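A likely reason the read-only bit made no difference: permission bits are not enforced against root, and a boot-time service presumably runs as root. The sketch below demonstrates this on a scratch file rather than the real menu.lst; the kernel line in it is a placeholder, not a real entry.

```shell
#!/bin/sh
# Demonstration on a scratch file, NOT on the real /boot/grub/menu.lst.
scratch=$(mktemp)
echo 'kernel (hd0,4)/boot/vmlinuz splash quiet nokmsboot' > "$scratch"
chmod 644 "$scratch"              # start from an ordinary rw-r--r-- file
chmod a-w "$scratch"              # clear the write bit for everyone
mode=$(stat -c %a "$scratch")     # GNU stat: permissions in octal
echo "mode is now $mode"          # prints: mode is now 444

if [ "$(id -u)" -eq 0 ]; then
    # root is not bound by the permission bits, so this normally succeeds
    echo '# rewritten' >> "$scratch" && echo "root could still write"
else
    echo "not root: a write would be refused here, but not for a root service"
fi
rm -f "$scratch"
```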

In preparing the detailed info for this response which you requested I think I have found the culprit. In particular I paid closer attention to the various time stamps and when the observed change occurs in the long list of journal events.

The current time stamp for menu.lst gives 14:54, but the output from
Code: Select all
stat /boot/grub/menu.lst
adds extra precision, giving 14:54:49.765230526.

The journal's boot logging starts at 14:53:57 and ends at 14:55:30 or thereabouts, so the change to my command line happened about 52 seconds into the boot process and some 40 seconds before it completed.
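Offsets like these are easy to get wrong by hand, so here is a small check using the three timestamps quoted above. The helper name to_secs is invented for this sketch; it only needs a POSIX shell and awk.

```shell
#!/bin/sh
to_secs() {
    # Convert HH:MM:SS[.fraction] to whole seconds since midnight.
    echo "$1" | awk -F: '{ printf "%d\n", $1 * 3600 + $2 * 60 + int($3) }'
}

boot_start=$(to_secs 14:53:57)            # first journal entry of the boot
rewrite=$(to_secs 14:54:49.765230526)     # mtime of menu.lst from stat
boot_end=$(to_secs 14:55:30)              # last boot-time journal entry

into_boot=$((rewrite - boot_start))
before_end=$((boot_end - rewrite))
echo "rewritten ${into_boot}s into the boot, ${before_end}s before it finished"
# prints: rewritten 52s into the boot, 41s before it finished
```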

[drum-roll] And what do we find in the journal at 14:54:49?[/drum-roll]
Code: Select all
Jan 29 14:54:49 caldera.local service_harddrake[839]: perImageAppend is now splash quiet noiswmd nokmsboot iommu=pt resume=UUID=28cf41b3-7f00-4424-9393-23a72cd3ba5e
Jan 29 14:54:49 caldera.local service_harddrake[839]: running: /sbin/display_driver_helper --is-kms-allowed
Jan 29 14:54:49 caldera.local service_harddrake[839]: modify_append: splash quiet noiswmd iommu=pt resume=UUID=28cf41b3-7f00-4424-9393-23a72cd3ba5e
Jan 29 14:54:49 caldera.local service_harddrake[839]: modify_append: splash quiet noiswmd iommu=pt resume=UUID=28cf41b3-7f00-4424-9393-23a72cd3ba5e
Jan 29 14:54:49 caldera.local service_harddrake[839]: modify_append: resume=UUID=28cf41b3-7f00-4424-9393-23a72cd3ba5e
Jan 29 14:54:49 caldera.local service_harddrake[839]: modify_append: splash quiet noiswmd iommu=pt resume=UUID=28cf41b3-7f00-4424-9393-23a72cd3ba5e
Jan 29 14:54:49 caldera.local service_harddrake[839]: moved file /boot/grub/device.map to /boot/grub/device.map.old
Jan 29 14:54:49 caldera.local service_harddrake[839]: created file /boot/grub/device.map
Jan 29 14:54:49 caldera.local service_harddrake[839]: writing grub config to /boot/grub/menu.lst
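To see exactly what was dropped, the option string before the rewrite can be compared word by word with the string afterwards. The sketch below does that with the two strings copied from the log above; dropped_options is a made-up helper name. On a live system the lines could probably be pulled with something like "journalctl -b -t service_harddrake", assuming the identifier matches the one shown in the log.

```shell
#!/bin/sh
dropped_options() {
    # Print every word of $1 that does not appear as a whole word in $2.
    for opt in $1; do
        case " $2 " in
            *" $opt "*) ;;                    # still present, say nothing
            *) printf '%s\n' "$opt" ;;        # missing from the new line
        esac
    done
}

# Strings copied from the perImageAppend and modify_append log lines.
before='splash quiet noiswmd nokmsboot iommu=pt resume=UUID=28cf41b3-7f00-4424-9393-23a72cd3ba5e'
after='splash quiet noiswmd iommu=pt resume=UUID=28cf41b3-7f00-4424-9393-23a72cd3ba5e'

removed=$(dropped_options "$before" "$after")
echo "removed by service_harddrake: $removed"
# prints: removed by service_harddrake: nokmsboot
```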


A quick check of MGA5 reveals that there has been no material change to /sbin/display_driver_helper which is revealing as this hardware boots perfectly well with MGA5. There is, however, one possibly significant difference between the modules loaded for MGA5 and those available for MGA6 - the MGA5 boot can make use of the fglrx driver for the display, but MGA6 must use radeon.

I conclude that the nokmsboot is deemed unnecessary as the display driver is currently radeon and thus my nanny quietly fixes that for me - not even a slap on the wrist! However, what my nanny doesn't know (how could she?) is that removing that option will cause the most complete boot crash I have ever seen.

I reckon I have two options:

(1) go on patching my menu.lst entry after each successful boot

(2) hack my installed copy of display_driver_helper --is-kms-allowed so it always says NO.

...and wait for fglrx to fix this.
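Option (1) could be scripted rather than done by hand. The following is a minimal sketch, assuming grub legacy's usual layout where each boot entry has a line starting with "kernel"; the function name ensure_nokmsboot and the sample entry are invented for illustration, and it is shown on a scratch file rather than the real menu.lst.

```shell
#!/bin/sh
ensure_nokmsboot() {
    # $1: path to a menu.lst-style file. Appends nokmsboot to every
    # "kernel" line that lacks it; all other lines pass through unchanged.
    tmp=$(mktemp) || return 1
    awk '
        /^[ \t]*kernel[ \t]/ && $0 !~ /(^|[ \t])nokmsboot([ \t]|$)/ {
            $0 = $0 " nokmsboot"
        }
        { print }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"   # cat keeps the owner and mode of $1
    rm -f "$tmp"
}

# Try it on a scratch copy, never directly on /boot/grub/menu.lst.
sample=$(mktemp)
cat > "$sample" <<'EOF'
title linux
kernel (hd0,4)/boot/vmlinuz splash quiet noiswmd iommu=pt
initrd (hd0,4)/boot/initrd.img
EOF

ensure_nokmsboot "$sample"
ensure_nokmsboot "$sample"    # second run changes nothing
grep kernel "$sample"
# prints: kernel (hd0,4)/boot/vmlinuz splash quiet noiswmd iommu=pt nokmsboot
```

This only automates the patching; it does nothing to stop the option being removed again on the next boot.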

What do you think?

Richard

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by gohlip » Jan 29th, '17, 21:05

Hi Richard, took me a while to read (and reread many times) your post. :D
I was trying to make some sense what you're doing. LOL.
jaywalker wrote:....using mc in a root shell...
Midnight Commander? wow.

You're obviously a 'seasoned veteran' (I mean it in a good way) but instead of going through all your 'points', I thought I'd suggest a procedure if that is okay with you. But one point that you made needs to be highlighted.

jaywalker wrote:When the MGA6 grub menu appears I let it time-out to boot to the default desktop, which it happily does.
But that would mean the changes you apply to mga5 menu.lst will not kick in.

For the sake of simplicity, let's boot first without the mga6 external drive connected to the system (we can put that in later on - but please let us know if that is grub 2 or grub-legacy).
And we'll use only the mga5 grub menu.lst to boot (if there are other linux OS in your system, let us know).
Always try both with 'nokmsboot' and without. Tell us what works for you. (We won't discuss whether it's 'right' at this point but we will use what works for you)

Please do that and come back to us.
Good luck.

ps; It's late here, it may be several hours before I respond.
ps: is it okay to use 'gedit' or 'kate' to amend your menu.lst? Not 'mc'. Thanks.
ps; there are several other points I'd really like to discuss. Like what causes 'default' boot; installing to partitions, not devices; installing the same bootloader to several devices (I do that). But we do have a more pressing issue to take care of now. We can do that after we settle the issue, if you like.

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by jaywalker » Jan 30th, '17, 02:54

gohlip wrote:Midnight Commander? wow.


1st confession; That's all I know ! I am completely at sea with vi or emacs, or any other console editor I have seen. To be horribly honest, I still haven't quite got over not being able to run "The Editor" from my Sinclair QL days.

gohlip wrote:You're obviously a 'seasoned veteran' (I mean it in a good way)[...]


"Veteran" perhaps, but hardly "seasoned". First hit the ground running with Mandrake 7 but I am still struggling to catch up.

gohlip wrote:[...]but instead of going through all your 'points', I thought I'd suggest a procedure if that is okay with you. But one point that you made needs to be highlighted.

When the MGA6 grub menu appears I let it time-out to boot to the default desktop, which it happily does.
But that would mean the changes you apply to mga5 menu.lst will not kick in.


Dammit, I knew I hadn't written enough! My way always seems so obvious to me that I often forget others won't necessarily see it that way. Here goes again with bullet points!

*1* Don't mess with the working (MGA5) system's boot loader
*2* Keep MGA5, MGA3 and MGA6 systems completely independent of each other
*3* Avoid automagickerey at all costs - any bootloader changes needed should be done after everything works and be done by me - not an untested wizard.
*4* Stick to a common boot loader - that means grub (legacy) for now

We haven't yet got to the point where I can boot MGA6 from the MGA5 boot menu. I can only boot MGA6 by using the BIOS boot device menu where I choose which drive, hence which bootloader to use. Basically there are no changes to the MGA5 bootloader. All of this has been about getting the MGA6 bootloader to work on its own disc with no reference to other drives or OS.

gohlip wrote:For the sake of simplicity, let's boot first without mga6 external connected to the system (we can put that in later on - but please let us know if that is grub 2 or grub-legacy).
And we'll use only mga5 grub menu.lst to boot (if there are other linux OS in your system, let us know).
Always try with both 'nokmsboot' and without. Tell us what works for you. (We won't discuss whether it's 'right' at this point but we will use what works for you)


System "layout"
1 box, 3 OS versions in 4 instances (I have a couple of MGA5s to choose from).
3 bootable drives; one internal SSD and two external USB spinning rust, one of which is a recent addition to make it possible to test Cauldron (MGA6).

The box has an ASUS F2A65-M LE motherboard which has an AMD A10-5800K and integrated 7660 graphics (driving two heads) backed by 16GiB DDR3 1600MHz RAM.

The Radeon 7660 gpu has sole responsibility for the two (occasionally three) attached monitors for all OS instances, despite having an external Nvidia GeForce GTX 960 connected which one of the MGA5 installations uses to accelerate 3D rendering in Blender.

All OS instances (MGA3 and MGA5) use the fglrx driver for 3D and all use the nokmsboot command line option. Sort of following your suggestion I have just re-booted my default MGA5 with the nokmsboot temporarily removed from the command line and the boot fails. No surprise. What did surprise me was that the crash was identical in nature and effects to what I have been experiencing with MGA6 - despite me saying I had never seen the like of it before. My only defence is that it must be so long ago that I first installed MGA3 on this machine, moving on to MGA5, that if it ever did fail like this I have forgotten it. More likely though is that the graphics adapter setup program properly put the option in the bootloader for me.

gohlip wrote:ps; It's late here, it may be several hours before I respond.
ps: is it okay to use 'gedit' or 'kate' to amend your menu.lst? Not 'mc'. Thanks.


Aaaarghhhh! See above re: The Editor. Seriously though, since altering menu.lst needs root access I always head straight to the command line, and mc has the only editor I know how to use there. Just occasionally I might do a
Code: Select all
su -c 'leafpad /boot/grub/menu.lst'
but that is too much typing!

gohlip wrote:ps; there are several other points I really like to discuss, really. Like what causes 'default' boot, installing to partitions, not devices; install same bootloader to several devices (I do that). but we do have a more pressing issue to take care of now. We can do that after we settle the issue, if you like.


Mmmm, yes please.

For now I think it is just a matter of understanding if it is the presence of the unused/unconfigured/nouveau-blacklisted Nvidia card which causes the boot without nokmsboot to fail. Omitting the option really ought to be the right thing to do and the crash I get is essentially the same as what you would expect if the fglrx driver were configured. However MGA6 is obliged to use the radeon driver for my A10 7660, so the nokmsboot which I need in order to boot the OS will always be removed by systemd running service_harddrake and thus always crash.....


Richard

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by gohlip » Jan 30th, '17, 09:56

[1]
You have 3 disks
jaywalker wrote:3 bootable drives; one internal SSD and two external USB spinning rust,

and that makes any grub-legacy boot unreliable (which is why I suggested you remove your external mga6 for the moment).
The reason is that the bios may change the order of sda, sdb and sdc on each boot, and grub-legacy addresses its files with "root (hdx,y)", though it may use a uuid in the linux kernel line. You will not have this problem with grub2, which uses the uuid in its 'search' line to find the correct device.

[2]
I am not familiar with mc (and I don't know why your journal shows a recreation of menu.lst) but I do know that by using gedit, kate, leafpad, kwrite...the changes are kept (without update-grub or grub-install commands).

[3]
I understand your point of 'nokmsboot', but I think the issue is more about grub booting up the wrong device, and so the wrong OS (you have 3 mga's, all using grub-legacy, all looking the same).
And that is why I suggested you use what works for you first, and then we'll talk about this later on after we fix the grub issue (so that it does not complicate the grub problem, which I think is the primary issue).


Now, since I know now you have 3 disks,
o I suggest you change menu.lst in all mga's to have 'nokmsboot' (as you say, it works with) and then making sure the changes remain (How? - Use leafpad as root).
o When booting, at grub menu, check the disks using command 'ls' (small 'L' and 's') at the grub prompt (type 'c' - "grub> ls") and change manually (press 'e') if the "root (hdx,y)" does not correspond to the right device.
o You may have to remove 'nokmsboot' if that does not boot and see that if by removing, it does boot this time round. But check again with 'ls' each time you start the computer (as said, bios may 'change' sda, sdb, sdc each time).


Of course, it is much simpler to connect only one device (so we don't have another variable to worry about).
But, it is entirely up to you on how you want to proceed with it.

ps: I have not used grub-legacy for at least 6 years now, but I think I can still manage to work things out.
I may not be sure of the right commands (press 'e', press 'c'... 'ls') so watch out for my mistakes.
I used to be quite good too in grub-legacy, to assure you. :P

[edit] - don't worry about this, Richard.
It's for the developers (if they read it).
There is no need either for "iommu=pt" or "noiswmd" (written on this sometime somewhere here in the forum)
or putting linuxefi/initrdefi or linux16/initrd16 (just use linux/initrd in all situations without the workarounds)
or using nokmsboot at all - just make the kms work without it.
If it is tougher than I thought, I apologize; but a simple "We cannot do it" will be reassuring.
Don't mean to criticize, just hope to contribute, seriously.

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by jaywalker » Jan 31st, '17, 04:52

gohlip wrote:[1]
You have 3 disks
3 bootable drives; one internal SSD and two external USB spinning rust,

and that makes any grub-legacy boots unreliably (and that is why I suggested you remove your external mga6 for the moment).
Reason being the bios may change the sda, sdb and sdc upon each boot and grub-legacy uses "root (hdx,y)" (you will not have this problem if using grub2) though it may use uuid in the linux kernel line; grub 2 uses uuid in its 'search' line to map out the correct device.


I understand and have learned to work with the limitations of grub legacy when working with OS installations on removable drives. I am happy to continue working with grub legacy until I have a stable system again, comprising 4 bootable installations: MGA3, MGA5 (2 of these) and MGA6.

You suggest I remove sdc (containing MGA6), leaving the machine in its normal working configuration. I can confirm that it still boots correctly in this state as none of its original Mageias have been touched in any way - either by accident or design.

All instances of MGA3 and MGA5 boot with the nokmsboot option in place, as needed for use of the fglrx display driver.

gohlip wrote:[2]
I am not familiar with mc (and I don't know why your journal shows a recreation of menu.lst) but I do know that by using gedit, kate, leafpad, kwrite...the changes are kept (without update-grub or grub-install commands).


Midnight Commander is just a plain ol' text-based file manager with a host of built-in extras such as a very intuitive interface for a screen-based editor. Just select the file in the file manager and hit F4 to edit it, F2 to save it and F10 to get back to the file manager.

The journal shows the creation (and saving to back-up *.old files) of
    device.map
    install.sh
    menu.lst
They are created, or not, by the service_harddrake service invoking
Code: Select all
display_driver_helper --is-kms-allowed
presumably to insert the option if it deems it is required or to remove the option if it disagrees with its presence.

If the check shows that it agrees with the state of the command line it will leave it alone. Otherwise it will re-write the line in menu.lst with what it thinks is correct (and rewrite device.map and install.sh for good measure).

This is clearly the root cause of my original problem (the disappearing nokmsboot) but it doesn't explain why I need it to avoid the big crash.

gohlip wrote:[3]
I understand your point of 'nokmsboot', but I think the issue is more about grub booting up the wrong device, and so the wrong OS (you have 3 mga's, all using grub-legacy, all looking the same).
And that is why I suggested you use what works for you first, and then we'll talk about this later on after we fix the grub issue (so that it does not complicate the grub problem, which I think is the primary issue).


In fact there is little wrong with the MGA6 grub setup, other than that I was obliged to install grub to the root of a new extra drive due to the inability of the sta1 iso's installation process to put it on a partition of an existing drive (there was room for it on sda2 - on the internal drive). The grub problem you refer to may be the one I documented in another thread, from which I understand that many of the installer's partition-related bugs have been fixed, but I will need to wait for a new iso, or do a network install to check that.

The real show-stopper for me is the disappearing nokmsboot option as that kills any attempt to boot MGA6 unless I remember to add the option on the grub menu screen, or use the failsafe menu entry. That is why I spent this evening trying to isolate the cause of the crash.

My first attempt was to remove the sdb USB drive (not needed for booting from sdc) and unplug the Nvidia card. I then booted MGA6 (with nokmsboot) and re-installed the free radeon driver (just in case it might do it differently in the absence of the Nvidia card), verified that the last boot had indeed cleared the nokmsboot option, updated the installation and finally re-booted.

When the BIOS splash screen appeared I tapped F8 to go to the boot device selection menu and chose the MGA6 drive to boot. It booted, as usual, presenting the shiny new pale-blue MGA6 grub legacy menu so I waited for it to time out and proceed with booting the default menu entry. It crashed.

I powered down (to recover) and re-started to disable the IOMMU switch in the BIOS (just in case - besides, I don't need it yet in MGA6) and saved/restarted to repeat the MGA6 boot sequence. This time I also removed the IOMMU options on the command line, and with nokmsboot still absent I removed the quiet and splash options too. Executing that lot took me to another crash. The boot messages filled one screen (very quickly) before it happened, but all I could see for sure was that the crash came about 3 seconds into the boot.

gohlip wrote:Now, since I know now you have 3 disks,
o I suggest you change menu.lst in all mga's to have 'nokmsboot' (as you say, it works with) and then making sure the changes remain (How? - Use leafpad as root).
o When booting, at grub menu, check the disks using command 'ls' (small 'L' and 's') at the grub prompt (type 'c' - "grub> ls") and change manually (press 'e') if the "root (hdx,y)" does not correspond to the right device.
o You may have to remove 'nokmsboot' if that does not boot and see that if by removing, it does boot this time round. But check again with 'ls' each time you start the computer (as said, bios may 'change' sda, sdb, sdc each time).


OK, for the first one, all my working grub legacy boots (3 in all) use the nokmsboot option as all of them need it for the fglrx display driver. The changes have always remained because service_harddrake tests for a display driver which needs the option and will only add it if it is needed and NOT already present (MGA5 x2 for sure - I haven't bothered to check MGA3 - later perhaps). Conversely, if it determines the option is present but not needed it will remove it and re-write the three grub files listed earlier.

On your second point, as each of the grub installations shows no symptoms of not being able to find itself on (hd0) I reckon that we can skip that for now. Each grub knows only one (hd0) and is entirely independent of any other drive until the OS takes over and finds everything it needs by UUID or partition label. This is also true of the two installations sharing sdb, though in that case the MGA5 installation is on sdb10 and MGA3 provides the drive bootloader with a chainload entry for (hd0,9)

remember:
    sda is the internal SSD with MGA5 only at sda1
    sdb is the "permanent" USB drive with MGA3 at sdb5 and another MGA5 at sdb10 - cross-referenced by their respective grub legacy bootloaders as (hd0,4) and (hd0,9)
    sdc is the new, occasional USB drive with MGA6 and no reference to any other OS on any other drive
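For anyone following the numbering: grub legacy counts partitions from 0 while Linux counts from 1, so (hd0,4) is partition 5 and (hd0,9) is partition 10 of whichever disc the BIOS handed over as hd0. A small sketch of that conversion follows; the function name grub_legacy_to_linux is made up, and the disc name has to be supplied by hand because grub's hdN says nothing about sda/sdb/sdc.

```shell
#!/bin/sh
grub_legacy_to_linux() {
    # $1: grub legacy spec such as "(hd0,4)"
    # $2: Linux name of the disc that was hd0 for this boot, e.g. "sdb"
    part=$(printf '%s\n' "$1" |
        sed -n 's/^(hd[0-9][0-9]*,\([0-9][0-9]*\))$/\1/p')
    [ -n "$part" ] || return 1        # not a (hdN,M) spec
    echo "$2$((part + 1))"            # grub counts from 0, Linux from 1
}

grub_legacy_to_linux "(hd0,4)" sdb    # prints: sdb5
grub_legacy_to_linux "(hd0,9)" sdb    # prints: sdb10
```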

And finally, because it is getting far too late and I am writing too much again, to summarise what happens with and without nokmsboot:

MGA5 (on sda1 and sdb10) both need the option - else crash. This is normal and display_driver_helper gets it right, so no messing with my command line

MGA3 (on sdb5 + master boot record) needs the option and probably crashes without it - as above, there is no messing with my command line, though I will need to check further to confirm why.

MGA6 (on sdc5 + master boot record) needs the option, but shouldn't need it. The display driver is the free ati from Xorg which uses KMS quite happily, and indeed the X server loads it later when needed. However, without the nokmsboot option, which is what display_driver_helper thinks is correct (and so do I), the boot process crashes within seconds of starting.

Here's a question; if booting using the nokmsboot option when it is not required for the configured display driver is safe and booting without it when it is needed is fatal, why should there be a need in the boot process to remove it if it is found and deemed unnecessary? Leaving it alone should be safe so even if display_driver_helper thinks it shouldn't be there the only rational thing to do is leave it alone. Sure, add it in if it might be needed - that would likely do no real harm, but removing it when somebody put it there deliberately is just plain meddling.
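To make the difference concrete, here is a toy model of the two policies. It is not the real harddrake code; it only mirrors the behaviour described in this thread. The first argument, "yes" or "no", is a stand-in for whatever display_driver_helper --is-kms-allowed reports, and all function names are invented.

```shell
#!/bin/sh
has_word() {   # succeeds if word $1 occurs in option string $2
    case " $2 " in *" $1 "*) return 0 ;; *) return 1 ;; esac
}

strip_word() { # print option string $2 with every word equal to $1 removed
    out=""
    for w in $2; do
        [ "$w" = "$1" ] || out="${out:+$out }$w"
    done
    printf '%s\n' "$out"
}

# $1 = kms allowed? (yes/no), $2 = current kernel options
observed_policy() {   # what the journal suggests happens now
    if [ "$1" = yes ]; then strip_word nokmsboot "$2"
    elif has_word nokmsboot "$2"; then printf '%s\n' "$2"
    else printf '%s\n' "$2 nokmsboot"
    fi
}

proposed_policy() {   # add the option when needed, never remove it
    if [ "$1" = no ] && ! has_word nokmsboot "$2"; then
        printf '%s\n' "$2 nokmsboot"
    else
        printf '%s\n' "$2"
    fi
}

opts='splash quiet noiswmd nokmsboot iommu=pt'
observed_policy yes "$opts"   # prints: splash quiet noiswmd iommu=pt
proposed_policy yes "$opts"   # prints: splash quiet noiswmd nokmsboot iommu=pt
```

Under the second policy a deliberately added nokmsboot survives every boot, which is the behaviour being argued for.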

Another fourpence worth - I must check the rationale tomorrow - add the option to a system using an Xorg driver - my bet is it won't crash.

Richard

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by gohlip » Jan 31st, '17, 07:36

jaywalker wrote:On your second point, as each of the grub installations shows no symptoms of not being able to find itself on (hd0) I reckon that we can skip that for now. Each grub knows only one (hd0) .....

That is precisely my point. Whether grub takes the device as hd0 depends entirely on what the bios tells grub it is, not the other way round. And as mentioned (many times), that can change with each boot.

Also, I've changed grub.cfg (many times, and just again to test; not menu.lst, as I don't use grub-legacy) without running 'grub-install' or 'update-grub' (mkconfig -o ......), and these changes remain even though the changes are spurious.
Grub 2 (and I think grub-l too) will ignore non-applicable and nonsensical parameters (that's why we can add "iommu=pt" or "noiswmd" even if not needed).

So Richard, I would have to 'take leave' at this point and hope others can help you further on it.
If you found a solution, let us know, ok?

Cheers. Good luck.

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Post by jaywalker » Jan 31st, '17, 21:29

Thank you very much for your guidance and encouragement, especially for the information about the bugfixes which may help the installation process and for your observations about grub2. Perhaps I will give that a try when I have got to the bottom of the nokmsboot issue on this motherboard.

I have just booted the MGA6 drive on another machine which has an Nvidia chipset and graphics card. So far, so good. I will explore further and see if it sheds any light on the core issue.

Thank you too for helping me to realise that not every motherboard/BIOS behaves as mine do when I use the boot device selection menu. On my machines the use of this boot drive selection method ensures that the (hd0) device seen by grub legacy will always be the selected device, be it sda, sdb or sdc! The BIOS sorts it out. That is why the grub installation on each master boot partition I use always seeks to boot from the (hd0) device it is given.
jaywalker
 
Posts: 341
Joined: Nov 17th, '11, 02:38
Location: Belfast, Northern Ireland

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Postby gohlip » Jan 31st, '17, 22:41

You're welcome, Richard. It's been a pleasure for me too to have this dialogue with you.

I understood you well in all your posts and points including this one.

On my machines the use of this boot drive selection method ensures that the (hd0) device seen by grub legacy will always be the selected device, be it sda, sdb or sdc! The BIOS sorts it out.

And yes, bios will normally use the booting device as sda or (hd0) and that will be used for grub (not applicable for sdb and sdc though). I understood this completely.

However... (aha) when booted up to OS, check if that OS remains as sda (hd0). That is where the assumption falls apart. You can do that, among other ways, with a terminal command
Code: Select all
findmnt /



For example, in my Mageia: I too have many bootloaders across my 4 disks (complicated to explain - I use my own grub, all in grub 2, and I also have systemd boot, but never mind...). Rest assured I used my grub2 bootloader on the Mageia device, which before boot the bios shows as (hd0) (checked with echo $root), and when booted up in Mageia, the terminal shows this.
Code: Select all
[pop@Two ~]$ findmnt /
TARGET SOURCE    FSTYPE OPTIONS
/      /dev/sdb2 ext4   rw,relatime,data=ordered
[pop@Two ~]$

[drum-roll] It is sdb, not sda. [Tada...]

So...... using your method of booting {ensuring (hd0) as boot device and booting to its OS in this device} does not ensure that the partition remains (hd0,x), and your 'grub-install /dev/sda' or update-grub (grub-mkconfig -o ...) will have your bootloader installed to the subsequent (hd0), which is no longer the device your OS resides on. Hope this is clear.
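The check described above can be wrapped in a small helper. This is only a sketch: the function names disk_of and root_disk are made up for illustration, and the suffix stripping assumes sdXN style names as seen in this thread (NVMe and mmcblk devices are named differently and are not handled).

```shell
# Strip the trailing partition number from a device node to get the disk,
# e.g. /dev/sdb2 -> /dev/sdb. Assumes sdXN style naming only.
disk_of() {
    printf '%s\n' "$1" | sed -E 's/[0-9]+$//'
}

# Report which device and disk hold the running root filesystem.
# findmnt -n -o SOURCE prints just the source column without a header.
root_disk() {
    src=$(findmnt -n -o SOURCE /) || return 1
    printf 'root is on %s (disk %s)\n' "$src" "$(disk_of "$src")"
}
```

Running root_disk after booting from each drive in turn shows whether the disk that grub saw as (hd0) really ends up as sda in the running OS.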

Cheers.
Why do we live? To prove not everything in nature has a purpose.
gohlip
 
Posts: 573
Joined: Jul 9th, '12, 10:50

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Postby jaywalker » Feb 1st, '17, 00:54

Aha! I see what you were getting at now - I think. The BIOS in the machine I normally use, the one we were talking about, presents the selected boot device to grub as (hd0) but device.map will tell me what the OS will call it. In the case of the machine I was using, the various device map files, and the OS to which each belongs, maintain a consistent translation of the devices in such a way that sda is always the internal drive, sdb is always its permanent companion and sdc is always the next drive found - currently the MGA6 drive.

So I always get:
Boot internal drive MGA5: (hd0) = sda, (hd1) = sdb, (hd2) = sdc
Boot usual external MGA3: (hd0) = sdb, (hd1)=sda, (hd2) = sdc
Boot test external MGA6: (hd0) = sdc, (hd1) = sda, (hd2) = sdb

This consistency may be purely due to the way I use the machine.

The sdb drive is normally the only external drive present at boot time. It is needed because the internal SSD, which is sda and the default MGA5 boot drive, is not very big. The sdb drive is 930-ish GiByte USB3 and holds various home partitions and /usr and /var and of course the swap drive for the others all to use.

Just for comparison I had a look at the boot setup on the machine currently running MGA6 from the same external drive. The device mapping there looks more like the nightmare you were describing from your experience with grub legacy. The external drive there is (hd0) but becomes sda, whereas the sole internal drive (which is NOT bootable) is (hd1) and sdf !!!!

This "new" MGA6 test-bed is quite an old PC and all I had to do to get it to boot was use the failsafe option to let me remove the nouveau blacklisting in MGA6 and reboot. It is now using the Nvidia proprietary driver and the nokmsboot option has been added correctly. So far there are no problems booting this hardware with either Xorg or Nvidia drivers so my next test will be to see what happens if I add a superfluous nokmsboot when using the nouveau driver.....I expect it to boot without a crash and that service_harddrake will do the "right" thing and remove the nokmsboot from the grub legacy command line via display_driver_helper.
jaywalker
 
Posts: 341
Joined: Nov 17th, '11, 02:38
Location: Belfast, Northern Ireland

Re: Why does "nokmsboot" option disappear during MGA6 boot?

Postby jaywalker » Feb 5th, '17, 19:27

And finally, if not, perhaps, conclusively... The answer is a very dirty hack. The reasoning is that I cannot determine what is at fault. The boot process crashes so quickly that it is impossible to read any useful information from the kernel messages, other than that the time offset is no more than 2 seconds. There may be a way to preserve or create a log which could be examined later, but I don't know how.

Booting the installation on hardware which can only use nouveau or nvidia is completely predictable and trouble-free. Moreover, adding the nokmsboot option to the boot of that machine is completely harmless. Leaving it out when it is needed is not a good idea as it will crash.

On the machine with the AMD A10 I tried setting the driver option in the xorg.conf file to ati, radeon and amdgpu. The amdgpu driver failed completely, which seemed odd as I expected that the "Northern Isles" gpu would be supported. None of these alternative driver selections produced any improvement of the machine's chances of booting correctly without the nokmsboot option.

Based on these observations I decided that the only thing I could do to stop the crashing was to stop display_driver_helper from reporting that the nokmsboot option should be removed. The magic bullet was on line 303 of the script where I forced the testing function to return 1 regardless of the test result. It works.
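To confirm after a boot that the hack is holding, one can compare the running kernel's command line with the stored grub legacy entry. A minimal sketch, assuming the menu lives at /boot/grub/menu.lst as discussed earlier in the thread; the function names are hypothetical and this is not part of display_driver_helper.

```shell
# Succeed if the whole word "nokmsboot" appears in the given file.
has_nokmsboot() {
    grep -qw 'nokmsboot' "$1" 2>/dev/null
}

# Compare the command line used for this boot with the stored grub menu.
report_nokmsboot() {
    for f in /proc/cmdline /boot/grub/menu.lst; do
        if has_nokmsboot "$f"; then
            echo "$f: nokmsboot present"
        else
            echo "$f: nokmsboot missing"
        fi
    done
}
```

If report_nokmsboot shows the option present in /proc/cmdline but missing from menu.lst, something removed it during or after that boot.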

I am glad nobody else experiences this problem as it is a real turn-off and almost convinced me to skip another release (as I did with MGA4).

Richard
jaywalker
 
Posts: 341
Joined: Nov 17th, '11, 02:38
Location: Belfast, Northern Ireland

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby rickst29 » Mar 15th, '17, 00:57

Your forum Thread is fascinating, and I have some similar problems. I have a roughly identical A10 system ('Kaveri'), but no graphics card (NVidia or ATI). Back on MGA-5.1, I have Kernel Parameter 'VGA=775' specified for my hi-res monitor. When I upgrade into Cauldron (using online urpmi upgrade), the console logging of runlevel 3 startup shows that my 'VGA' specification is kept intact at boot time. (Although it might be 'wiped out' on the 'Graphics Console' when switching to runlevel 5.) Most recently, any number of weird sequences happen when I attempt to login to a graphics desktop from the DM:

1) I can always start an initial Gnome session on X11 or Wayland, with compositing graphics.
2) I can logout and 'login' with another Gnome session *if and only if* I didn't try to run a Plasma session in between;
3) I can NEVER run a Plasma session first;
4) I can run ONE plasma session as my second desktop, after starting Gnome once and logging out.
5) Subsequent to running and terminating one Plasma Session, no graphical desktop works (until going back to runlevel 3, restarting X11 and starting off with gnome).

It's weird, and I'm still playing with possible workarounds. So far, the Xorg log reports no errors even when falling into "fatal" situations. (Although I haven't cranked up the debug levels.)
rickst29
 
Posts: 33
Joined: May 30th, '11, 00:55

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby jaywalker » Mar 15th, '17, 03:30

rickst29 wrote:I have a roughly identical A10 system ('Kaveri'), but no graphics card (NVidia or ATI).


In practice that is very much the same setup I use. My additional Nvidia card is very much a red herring; both because it drives no monitors and bothers Xorg not at all, and because I have not been able to determine the slightest difference in misbehaviour whether it is actually present or not.

rickst29 wrote:Back on MGA-5.1, I have Kernel Parameter 'VGA=775' specified for my hi-res monitor.


I blush to confess that I have never really understood how that parameter works, but I have looked it up and I see that you have set your screen to 8 bit colour @ 1280x1024 and mine is set for 800x600 - THAT explains a lot! I had given up on ever getting the default screen anywhere close to "nice" and have always relied on various arcane manipulations of the xorg.conf file, randr settings and, where appropriate and necessary, the Nvidia or ATI setup tools. I suppose the main reason for my behaviour is that I am always running on two or three screens of 1280x1024, 1680x1050 or 1920x1080.
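For anyone else puzzled by the number: the vga= value is just the decimal form of a hexadecimal video mode number, so 775 is 0x307. A small sketch of the lookup follows; the function names are invented for illustration and the table lists only a handful of the commonly documented modes, not all of them.

```shell
# Convert a decimal vga= value to the hex form used in video mode tables.
vga_hex() {
    printf '0x%x\n' "$1"
}

# Look up a few well-known modes; anything else is reported as unknown.
vga_mode() {
    case "$(vga_hex "$1")" in
        0x303) echo "800x600, 8-bit colour" ;;
        0x305) echo "1024x768, 8-bit colour" ;;
        0x307) echo "1280x1024, 8-bit colour" ;;
        0x31a) echo "1280x1024, 16-bit colour" ;;
        0x31b) echo "1280x1024, 24-bit colour" ;;
        *)     echo "unknown mode $(vga_hex "$1")" ;;
    esac
}
```

Calling vga_mode 775 gives the 1280x1024 8-bit entry, which matches what I found when I looked it up.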

rickst29 wrote:When I upgrade into Cauldron (using online urpmi upgrade), the console logging of runlevel 3 startup shows that my 'VGA' specification is kept intact at boot time.


This bit is interesting. I tried it out to see what happens for me but it is hard to divine from just the appearance of the text. With MGA5 and the nokmsboot option needed for use of the fglrx driver the kernel messages appear to fill the screen of a 1920x1080 monitor with lines which look like they could be stretched 1280x1024. The companion 1680x1050 screen seems to do the same and nothing changes until the change to runlevel 5 when both screens go blank in preparation for the X server.

Doing exactly the same thing in Cauldron (MGA6) is slightly different. Firstly the xorg driver must be used instead of fglrx as the proprietary driver has been dropped. That means, in theory, that the nokmsboot option should be dropped, but as discussed above I must use it to avoid an instant boot crash.

Now the kernel messages appear as they did in the MGA5 boot until ... not sure precisely when this happens, but at some point before the handover to runlevel 5 the screen mode of the 1920x1080 screen switches to what looks like native resolution. The second screen seems to stick with the stretched 1280x1024, then they go blank and the X server starts.

So far we seem to have similar experiences, though I am intrigued to know why you don't get the kms change to the text screen resolution which I get, just before the X server starts, but long enough before to see some messages appear at this higher resolution (tiny text). Perhaps it is because I am booting straight to the gui with "splash" and "quiet" disabled to see the result of the "vga=775" change, whereas you describe a runlevel 3 boot followed by a change to 5.

As for your graphical desktop nightmares, oh boy am I glad I fell out of love with KDE when they kicked their mature KDE 3.5 to the kerb and started again with the KDE 4 infant. I have been on LXDE ever since and I have never regretted the change. This has two consequences in the context of your issues; I know nothing about Wayland and its use of xwayland in support of X11 windows and on the other hand I have simply looked on in disinterested amusement at the never-ending story of Plasma. There has been some recent chatter about major improvements to Plasma in Cauldron from which I gather that, to some extent, it works now so if you haven't already done so, I would run an update.

Here's another thought, though not from any deep knowledge or understanding of what can go wrong, but from my experience, logging out to the DM usually kills the running X server and re-starts it so switching to runlevel 3 and back again to do the same thing should be superfluous. Does logging out of Gnome do the same thing to Wayland? Kill it?

Does Plasma need Wayland, or is it an option and if so, do you know whether you are using X11 or Wayland? I have the same question for Gnome actually as your description implies that you are able to use EITHER Wayland OR x11 for Gnome.

rickst29 wrote:1) I can always start an initial Gnome session on X11 or Wayland, with compositing graphics.


I didn't realise that was possible. Now you have me intrigued. I must look into this Wayland thing. I have an MGA6 sta2 image ready to stick on something. I'll try out this Gnome and Plasma stuff and see how far I get - at the weekend - perhaps.

By the way, I forgot to ask, are you managing to boot your A10 without the nokmsboot kernel option and what x11 driver have you configured for it?
jaywalker
 
Posts: 341
Joined: Nov 17th, '11, 02:38
Location: Belfast, Northern Ireland

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby rickst29 » Mar 15th, '17, 07:59

from lsmod:

amdgpu 1531904 0
amdkfd 139264 1
amd_iommu_v2 20480 1 amdkfd
radeon 1486848 15
i2c_algo_bit 16384 2 amdgpu,radeon
drm_kms_helper 135168 2 amdgpu,radeon
ttm 90112 2 amdgpu,radeon
drm 335872 13 amdgpu,radeon,ttm,drm_kms_helper

I've tried blacklisting radeon, and creating an xorg configuration with amdgpu - but it leads to either an immediate "nice" X-Server failure about wrong DRM version (amdgpu requires V3, radeon and lots of Mageia software seem to require V2); *or* a hard lockup (needing the "power button" on the case); *or*, as is happening right now, loading of the 'ati' driver anyway (see above, no hooks into 'amdgpu', tons of hooks into the radeon module).
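The "hooks" can be read straight off the third lsmod column, which is the use count. A rough sketch that filters lsmod-style input for the two drivers in question; gpu_module_use is a made-up name, and the filter only looks at amdgpu and radeon.

```shell
# Read lsmod-style lines on stdin and print the use count (column 3)
# for the amdgpu and radeon kernel modules.
gpu_module_use() {
    awk '$1 == "amdgpu" || $1 == "radeon" { printf "%s used_by=%s\n", $1, $3 }'
}
```

Used as lsmod | gpu_module_use, the listing above would give a count of 0 for amdgpu and 15 for radeon. For a second opinion, lspci -k prints a "Kernel driver in use" line for each device.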

Once I've got the ATI 'Open Sauce' driver, there seems to be a sequence of "flip-good", "flip-bad" alternating between most session types. Gnome on wayland *or* X11 works on odd-numbered desktop sessions (first session, third session, 5th session). Even-numbered login sessions fail, falling back to the DM after a short time. But Plasma only works on a single session: It *must* be the SECOND session, after logging into a Gnome session of either type (and yes, that's a situation where Gnome itself can't come up.)

It's definitely messed up. If I ask urpme to take out the drm-2 package, it offers to delete about 1600 other packages with it. I think that "drm_kms_helper" is doing bad stuff, or a non-kernel program is doing bad stuff. Maybe both of them. Either way, I've got bad sessions with ATI/radeon most of the time, and no graphics at all trying to get amdgpu to run.
rickst29
 
Posts: 33
Joined: May 30th, '11, 00:55

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby rickst29 » Mar 15th, '17, 21:49

To allow for some additional 'experiments', I've ordered a Radeon card. (A cheap one, only $50.)

If the card works OK, with "A-series" built-in graphics disabled, then we might have an "active" --> "suspend" --> "disable" --> "enable" power management issue, trying to suspend on-chip graphics which SHOULDN'T be suspended/disabled. If the card fails in the exact same ways, then all ATI configurations in MGA-6 are borked... and I'll raise the priority and severity of my existing bug report. ( https://bugs.mageia.org/show_bug.cgi?id=20452 is currently "normal", and should probably be changed into a "P1" showstopper if all ATI graphics dies on unmodified MGA-6.)

Even if the card passes, but "A-series" APUs suffer these failures, it's probably a "P1" anyway:
  • Lots of notebooks come with AMD APUs, and can't be upgraded with a separate card;
  • ALL systems with APU graphics ran fine under MGA-5.x;
  • and the market share of such systems is likely to increase.
If I understand correctly, the newest proprietary Nvidia driver stack requires, builds, and then uses a specific "drm-nvidia" module - it does not attempt to share the "standard" V2 drm module with other possible cards and drivers. Perhaps drm for 'amdgpu' needs to be implemented the same way, or left "unsupported" in MGA-6?
- - - - -
If amdgpu can be made to work AND I have too much time on my hands, then I might try to play with crossfire. :twisted:
Last edited by doktor5000 on Mar 15th, '17, 22:39, edited 1 time in total.
Reason: fixed bold tags
rickst29
 
Posts: 33
Joined: May 30th, '11, 00:55

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby rickst29 » Mar 16th, '17, 06:17

I'm now running Plasma on 'amdgpu', still using Kaveri's integrated GPUs. A new kernel build probably made this possible.

It's buggy in two ways:

First, I still get lock-up from within the DM a lot of the time (after choosing a session type and trying to start it; same bug as the 'ATI' driver).
But I found a better way to recover and try again in the case of "hard" lock-ups with Plasma attempts. I switch to a root console VT (typically VT4 or VT5), get the PID of the XServer, and hit it with "kill -6". Since I'm still in Runlevel 5, it restarted almost instantly - and Plasma came up with the "fresh" X Server.
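That recovery step can be sketched as two small functions. The names are invented for illustration, and the sketch assumes the server process is called Xorg or X, which may differ on other setups. Signal 6 is SIGABRT.

```shell
# Print the PID of the oldest running X server, trying both common names.
x_server_pid() {
    pgrep -o -x Xorg || pgrep -o -x X
}

# Send SIGABRT (signal 6) to the X server so the display manager respawns it.
# Run as root from a text console. An explicit PID may be given as $1.
restart_x() {
    pid=${1:-$(x_server_pid)}
    [ -n "$pid" ] || { echo "no X server found" >&2; return 1; }
    kill -6 "$pid"
}
```

From the root VT, restart_x alone looks up the server and aborts it; the DM then brings up a fresh instance because the system is still in runlevel 5.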

In this evening's particular sequence, I ran a few Gnome Sessions (with Wayland and with X) some of them worked, and some of them didn't, but none locked the DM. I then tried a Plasma session, and I got the mostly-unrecoverable 'lock-up': Killed X to provoke a new Server instance, and the Plasma session worked - as the first session on that "new" instance.

But Second, even when it does work, it takes a really long time to bring up the Plasma/KWin screen. I have no idea why. I'll try another instance (of X and Plasma) next, and I'll come back and edit this reply to describe whether the second instance of Plasma came up faster. (Bye - I'm going to log out, go back to the DM, and try another Plasma login - which will probably need another "fresh" X-Server to work correctly). :twisted:
-----------
I'm back, really quick: no hang in DM upon selecting plasma, and the second session came up really fast. :D
Last edited by rickst29 on Mar 16th, '17, 07:24, edited 1 time in total.
rickst29
 
Posts: 33
Joined: May 30th, '11, 00:55

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby rickst29 » Mar 16th, '17, 06:35

Great :? After restart, the first 'Plasma' attempt hangs in the DM, and the second one (after 'kill -6' to the X11 task) worked - reasonably fast, too.

This is a mess.
rickst29
 
Posts: 33
Joined: May 30th, '11, 00:55

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby jaywalker » Mar 20th, '17, 17:05

Sorry Rick, I have been distracted this weekend (long though it has been) by adsl issues at my brother's home and only now getting back to things bubbling in caldrons.

I have, however, confirmed that my A10-5800 is not supported by amdgpu, being Aruba, so I am stuck with the radeon driver for now. I see there has been a fair bit of activity on Cauldron with things plasmoid and gnomish so I am still looking forward to testing your mix of Wayland/X11/Plasma/Gnome problems.

For now though I think I may have a new insight on my boot problem after today's Plymouth update. I am pursuing that through a bug report.
jaywalker
 
Posts: 341
Joined: Nov 17th, '11, 02:38
Location: Belfast, Northern Ireland

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby rickst29 » Mar 21st, '17, 01:50

Status report with unmodified software:

The new video card has arrived - but I'm now running OpenSuse "tumbleweed" on the "test" drive (as a production system). ("Tumbleweed" rolls with the latest-and-greatest, much newer stuff than Cauldron, so that's a bit dangerous.) For Mageia, I will wait for a $3 drive bay adapter to arrive before I mount another "scratch" HD (size 2.5") into a 3.5" system bay for Mageia upgrade use - I'm very clumsy and likely to drop it unless it's screwed in. But, before I switched to OpenSuse, I found that the key factor is compositing: When I delete the feature from my KDE setup, Plasma and KWin work. And when I add the feature to other Desktops (e.g., legacy Gnome with Compiz), then they break. It might also concern the kind of compositing extensions being used.

OpenSuse runs with no hacking, but I did a "full install", formatting /usr and /etc. (I temporarily copied my GPG and SSH keys into a /home backup, to restore after the build.) Plasma/Kwin runs with 100% of supported glitz. It lacks two features which I liked from KDE4, but that's an upstream issue.

So, when I've got a new "test" disk mounted, I'll see whether doing a full Mageia install prevents the problem (rather than online upgrade). I already tried the DVD-based "update your system" route using STA-2, but it fell on its face partway through .... and LOTS of software has changed in the last two weeks, so I'm not going to try that route until a new ISO exists.
rickst29
 
Posts: 33
Joined: May 30th, '11, 00:55

Re: [HACK WORKS]Why does "nokmsboot" disappear during MGA6 b

Postby rickst29 » Mar 25th, '17, 22:01

I've now got the second video card. Interestingly - my 'fm2+' motherboard does not offer an option to "favor PCIe video card" while disabling on-chip video.

When I think about it, the reason why is obvious: The initial scan of basic resources (such as the memory bus) WILL find the on-chip video active and available. It may be impossible to avoid using this video for initial startup. But going into runlevel=5, it is possible to "force" X11 to use the PCIe card - via randr, and (with funky results...) even using a Xinerama-like xorg configuration.

I think that no one else is going to mess around with this scheme, and I don't want to spend too long playing with it. (One reason: My only mageia system has been the desktop, where the "good" monitor is 2560x1440. Mixing up my desktop with an adjacent "bad" monitor of only 1920x1080 is awkward, less attractive than switching desktops on the big one.) It is also likely that all possible implementations of Mageia-Cauldron Plasma on ati video (A10 video AND separate R7 video card) will have the same bug anyway: I suspect that Mageia-Cauldron defaults to unworkable "default parameters" to xrandr in the startup of a plasma session with compositing.

I may play with booting "test-mageia" only to runlevel 3 and investigating such an error path - when I next have time. Right now, I'm switched over to openSUSE "tumbleweed" with xfce desktop - and liking it. On a full install of OpenSUSE "Tumbleweed", Plasma has no issues with the default xorg configuration - even though I've now abandoned Plasma, due to KF5 abandoning important features of KDE4 desktop appearance and desktop switching. (A lead designer insists that "activity switching" replaces desktop switching, even though that's manifestly untrue.) Plasma 5.9 doesn't work for me, although I might switch back to Plasma when some important features (switch desktops on the edge, and different backgrounds for each desktop) are brought back.

Plymouth aborts the graphics screen on startup (we appear to have a similar problem in Cauldron, where the graphical boot screen goes blank with 3 tiny characters). But after that, it all works. The boot parameters I have with openSUSE are:
Code: Select all
video=2560x1440 resume=/dev/disk/by-uuid/e2a0db27-c3fd-415b-9352-6139b3cab99f quiet showopts
rickst29
 
Posts: 33
Joined: May 30th, '11, 00:55

