[SOLVED] boot loop after kernel upgrade using nvidia driver

Postby killerkaninchen » Apr 10th, '18, 22:10

Hi all

I use Mageia 6 with an Nvidia GeForce GTX 1050. Up to kernel 4.14.20 I could use the proprietary driver. After upgrading to 4.14.25+ I got a message that the driver has changed and the system must restart. After restarting, the same message appears again on every reboot. Aborting lets me log in in text mode, but startx does not work and I can't log in graphically. I used the recommended graphical tool (forgot the name) to set up the nvidia driver.

Logfile:
Code: Select all
[   183.476]
X.Org X Server 1.19.5
Release Date: 2017-10-12
[   183.476] X Protocol Version 11, Revision 0
[   183.477] Build Operating System: ecosse 4.4.88-server-1.mga5
[   183.477] Current Operating System: Linux localhost.localdomain 4.14.20-desktop-1.mga6 #1 SMP Sun Feb 18 01:22:02 UTC 2018 x86_64
[   183.477] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.14.20-desktop-1.mga6 root=UUID=31e8e1e4-928f-43d5-af58-4e4c3aebef3b ro splash quiet noiswmd resume=UUID=1b4eec2f-9576-4c5c-a87b-df6eb1b957ce audit=0
[   183.477] Build Date: 13 October 2017  04:36:29PM
[   183.477] Build ID: x11-server 1.19.5-1.1.mga6
[   183.477] Current version of pixman: 0.34.0
[   183.477]    Before reporting problems, check https://bugs.mageia.org
   to make sure that you have the latest version.
[   183.477] Markers: (--) probed, (**) from config file, (==) default setting,
   (++) from command line, (!!) notice, (II) informational,
   (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   183.477] (==) Log file: "/var/log/Xorg.0.log", Time: Tue Apr 10 21:21:27 2018
[   183.477] (==) Using config file: "/etc/X11/xorg.conf"
[   183.477] (==) Using config directory: "/etc/X11/xorg.conf.d"
[   183.477] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   183.477] (==) ServerLayout "layout1"
[   183.477] (**) |-->Screen "screen1" (0)
[   183.477] (**) |   |-->Monitor "monitor1"
[   183.477] (**) |   |-->Device "device1"
[   183.477] (**) |-->Input Device "Keyboard0"
[   183.477] (**) |-->Input Device "Mouse0"
[   183.477] (**) Option "DontZap" "True"
[   183.477] (**) Option "AllowMouseOpenFail"
[   183.477] (**) Option "Xinerama" "0"
[   183.477] (==) Automatically adding devices
[   183.477] (==) Automatically enabling devices
[   183.477] (==) Automatically adding GPU devices
[   183.477] (==) Automatically binding GPU devices
[   183.477] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   183.477] (==) FontPath set to:
   catalogue:/etc/X11/fontpath.d,
   built-ins
[   183.477] (**) ModulePath set to "/usr/lib64/xorg/extra-modules,/usr/lib64/xorg/modules,/usr/lib/xorg/extra-modules,/usr/lib/xorg/modules"
[   183.477] (**) Extension "Composite" is disabled
[   183.477] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[   183.477] (WW) Disabling Keyboard0
[   183.477] (WW) Disabling Mouse0
[   183.477] (II) Loader magic: 0x80ed80
[   183.477] (II) Module ABI versions:
[   183.477]    X.Org ANSI C Emulation: 0.4
[   183.477]    X.Org Video Driver: 23.0
[   183.477]    X.Org XInput driver : 24.1
[   183.477]    X.Org Server Extension : 10.0
[   183.480] (--) using VT number 1

[   183.480] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[   183.481] (--) PCI:*(0:1:0:0) 10de:1c81:10de:11c0 rev 161, Mem @ 0xf6000000/16777216, 0xe0000000/268435456, 0xf0000000/33554432, I/O @ 0x0000e000/128, BIOS @ 0x????????/131072
[   183.481] (WW) Open ACPI failed (/var/run/acpid.socket) (Connection refused)
[   183.481] (II) "glx" will be loaded by default.
[   183.481] (II) LoadModule: "v4l"
[   183.483] (II) Loading /usr/lib64/xorg/modules/drivers/v4l_drv.so
[   183.483] (II) Module v4l: vendor="X.Org Foundation"
[   183.483]    compiled for 1.19.2, module version = 0.1.1
[   183.483]    ABI class: X.Org Video Driver, version 23.0
[   183.483] (II) LoadModule: "glx"
[   183.483] (II) Loading /usr/lib64/xorg/extra-modules/libglx.so
[   183.489] (II) Module glx: vendor="NVIDIA Corporation"
[   183.489]    compiled for 4.0.2, module version = 1.0.0
[   183.489]    Module class: X.Org Server Extension
[   183.489] (II) NVIDIA GLX Module  390.42  Sat Mar  3 03:25:37 PST 2018
[   183.489] (II) LoadModule: "nouveau"
[   183.490] (II) Loading /usr/lib64/xorg/modules/drivers/nouveau_drv.so
[   183.490] (II) Module nouveau: vendor="X.Org Foundation"
[   183.490]    compiled for 1.19.3, module version = 1.0.15
[   183.490]    Module class: X.Org Video Driver
[   183.490]    ABI class: X.Org Video Driver, version 23.0
[   183.490] (II) v4l driver for Video4Linux overlay mode (V4L2)
[   183.491] (II) NOUVEAU driver
[   183.491] (II) NOUVEAU driver for NVIDIA chipset families :
[   183.491]    RIVA TNT        (NV04)
[   183.491]    RIVA TNT2       (NV05)
[   183.491]    GeForce 256     (NV10)
[   183.491]    GeForce 2       (NV11, NV15)
[   183.491]    GeForce 4MX     (NV17, NV18)
[   183.491]    GeForce 3       (NV20)
[   183.491]    GeForce 4Ti     (NV25, NV28)
[   183.491]    GeForce FX      (NV3x)
[   183.491]    GeForce 6       (NV4x)
[   183.491]    GeForce 7       (G7x)
[   183.491]    GeForce 8       (G8x)
[   183.491]    GeForce GTX 200 (NVA0)
[   183.491]    GeForce GTX 400 (NVC0)
[   183.491] (WW) xf86OpenConsole: setpgid failed: Operation not permitted
[   183.491] (WW) xf86OpenConsole: setsid failed: Operation not permitted
[   183.494] (WW) Falling back to old probe method for v4l
[   183.494] (II) v4l: Initiating device probe
[   185.127] (EE) [drm] Failed to open DRM device for pci:0000:01:00.0: -19
[   185.127] (WW) Falling back to old probe method for v4l
[   185.127] (II) v4l: Initiating device probe
[   185.127] (EE) No devices detected.
[   185.127] (EE)
Fatal server error:
[   185.127] (EE) no screens found(EE)
[   185.127] (EE)
Please consult the Mageia support
    at https://bugs.mageia.org
 for help.
[   185.128] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   185.128] (EE)
[   185.130] (EE) Server terminated with error (1). Closing log file.


Installing the latest nvidia driver from the NVIDIA website did not change anything.

Using the nouveau driver I can boot and log into Mageia.
(At boot it hangs for maybe one minute and prints:
Code: Select all
[    1.791947] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 122124

Maybe this is related to the nvidia-problem...?)

Can anyone help me get the proprietary driver working?

Thanks in advance
Killerkaninchen
Last edited by killerkaninchen on Apr 14th, '18, 14:55, edited 1 time in total.
killerkaninchen
 
Posts: 14
Joined: Apr 10th, '18, 21:26

Re: boot loop after kernel upgrade using nvidia driver

Postby doktor5000 » Apr 10th, '18, 22:29

Do you use grub or grub2? And do you have nokmsboot as a boot option?
Cauldron is not for the faint of heart!
Caution: Hot, bubbling magic inside. May explode or cook your kittens!
----
Disclaimer: Beware of allergic reactions in answer to unconstructive complaint-type posts
doktor5000
 
Posts: 17629
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: boot loop after kernel upgrade using nvidia driver

Postby killerkaninchen » Apr 11th, '18, 09:33

I use grub2 and no special boot options.

Re: boot loop after kernel upgrade using nvidia driver

Postby doktor5000 » Apr 11th, '18, 18:14

Well, if you don't have nokmsboot as a boot option, then that is already your issue. It should be present in /etc/default/grub in the GRUB_CMDLINE_LINUX_DEFAULT line.
See https://bugs.mageia.org/show_bug.cgi?id=21250 and https://bugs.mageia.org/show_bug.cgi?id=21263
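
A rough way to check both places from a text console is sketched below. The helper name has_boot_option is made up for illustration, and /proc/cmdline is assumed to be readable (it is the standard Linux file listing the options the running kernel was booted with). The script only reads files and changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the space-separated option list $1
# contains the exact word $2 (so "nokmsboot2" does not count as "nokmsboot").
has_boot_option() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# What the running kernel was actually booted with.
if [ -r /proc/cmdline ]; then
    if has_boot_option "$(cat /proc/cmdline)" nokmsboot; then
        echo "nokmsboot is active for this boot"
    else
        echo "nokmsboot is NOT active for this boot"
    fi
fi

# What grub2 is configured to pass on future boots.
if [ -r /etc/default/grub ]; then
    grep '^GRUB_CMDLINE_LINUX_DEFAULT=' /etc/default/grub \
        || echo "no GRUB_CMDLINE_LINUX_DEFAULT line found"
fi
```

If the first check says the option is not active but the grub file lists it, the grub configuration probably has not been regenerated since the edit.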

Re: boot loop after kernel upgrade using nvidia driver

Postby killerkaninchen » Apr 11th, '18, 21:33

OK - I'll have a look and try.

Re: boot loop after kernel upgrade using nvidia driver

Postby killerkaninchen » Apr 11th, '18, 23:40

I installed the nvidia driver via XFdrake.
I edited /etc/default/grub:
Code: Select all
GRUB_CMDLINE_LINUX_DEFAULT=" splash quiet noiswmd resume=UUID=1b4eec2f-9576-4c5c-a87b-df6eb1b957ce audit=0 nokmsboot"

and updated it
Code: Select all
sudo update-grub2

After a reboot into kernel 4.14.30 the message about the changed driver was gone, but the system seemed frozen. Nothing happened on tty1 and tty2, but on tty3 I could log in, and I blacklisted nouveau:
Code: Select all
# in /etc/modprobe.d/blacklist.conf
blacklist nouveau

and rebooted again. Nothing changed.
Tried startx:
Code: Select all
[   246.483] _XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
[   246.483] _XSERVTransMakeAllCOTSServerListeners: server already running
[   246.483] (EE)
Fatal server error:
[   246.483] (EE) Cannot establish any listening sockets - Make sure an X server isn't already running(EE)
[   246.483] (EE)
Please consult the Mageia support
    at https://bugs.mageia.org
 for help.
[   246.483] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   246.484] (EE)
[   246.485] (EE) Server terminated with error (1). Closing log file.


Where is the mistake?

Re: boot loop after kernel upgrade using nvidia driver

Postby morgano » Apr 12th, '18, 09:20

Do you find anything interesting in /var/log/Xorg.0.log?
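
The log is long, so it may help to look only at the lines marked (EE) (error) and (WW) (warning), as the marker legend at the top of the log describes. A minimal sketch, where the function name xorg_problems is just an illustrative choice:

```shell
#!/bin/sh
# Hypothetical helper: print only the error (EE) and warning (WW) lines
# of the Xorg log file given as $1.
xorg_problems() {
    grep -E '\((EE|WW)\)' "$1"
}

log=/var/log/Xorg.0.log
if [ -r "$log" ]; then
    xorg_problems "$log" || echo "no (EE)/(WW) lines found in $log"
fi
```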
Mandriva since 2006, Mageia 2011 at home & work. Thinkpad T40, T43, T400, T510, Dell M4400, M6300, Acer Aspire 7. Workstation using LVM, LUKS, VirtualBox, BOINC
morgano
 
Posts: 1306
Joined: Jun 15th, '11, 17:51
Location: Kivik, Sweden

Re: boot loop after kernel upgrade using nvidia driver

Postby arnesp » Apr 12th, '18, 20:25

I am using proprietary drivers for a GeForce GTX 1050 Ti and had no problems updating to kernel 4.14.25 (the nvidia driver was updated to 390.42 in the same session) and later to kernel 4.14.30, so it would appear that your nvidia installation somehow got broken.

One thing to be aware of is that selecting the proprietary drivers through XFdrake does not fix an already broken nvidia installation.
The following approach has successfully restored nvidia drivers for me several times:
- boot using nouveau drivers
- uninstall nvidia packages (e.g. using rpmdrake)
- select proprietary drivers using XFdrake
- reboot

If XFdrake is started from a Konsole window, you should see progress reports there, from download and installation of 3 nvidia packages through build and installation of 4 kernel modules.

As doktor5000 points out, the nokmsboot flag is needed for successful boot with the nvidia driver.
Blacklisting nouveau should, however, not be necessary.
arnesp
 
Posts: 60
Joined: Aug 6th, '15, 00:41

Re: boot loop after kernel upgrade using nvidia driver

Postby killerkaninchen » Apr 12th, '18, 22:35

Thanks. I'll try tomorrow.

Re: boot loop after kernel upgrade using nvidia driver

Postby killerkaninchen » Apr 14th, '18, 00:14

I switched to the nouveau driver using XFdrake and removed the nouveau blacklist entry. After the reboot the system said "searching for new hardware components" but never reached the graphical login. So I switched back to the nvidia driver using urpmi and restored xorg.conf.backup. But booting with the nvidia driver and kernel 4.14.20 did not change anything. After that I tried several combinations of drivers and kernels, but I never get to the graphical login.

I'm on another computer now, so I can't give you the logs at the moment.

Re: boot loop after kernel upgrade using nvidia driver

Postby arnesp » Apr 14th, '18, 12:26

Sorry, I forgot to mention that booting with nouveau works best without the nokmsboot flag.
If it is set, booting hangs for about 1 minute showing "searching for new hardware components".
Did you wait longer than that before giving up on the nouveau boot?
If you can't get a graphical login, you may try booting to "runlevel 3" (press e on the grub menu and add 3 to the boot line) and do the nvidia removal using urpme on the command line. XFdrake also works in a console.

EDIT!
Just tested, running XFdrake at runlevel 3 (or in a "ctrl-alt Fx" console) didn't restore nvidia drivers!
However, just uninstalling the nvidia packages and rebooting enabled me to get a desktop using nouveau drivers. Using XFdrake here to select proprietary drivers did allow me to reboot into a desktop using nvidia drivers.
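
For the removal step on the command line, something like the sketch below could be used to see which nvidia packages are installed before removing them. It is only a dry run that prints a urpme command rather than running it. The helper name filter_nvidia is made up, and matching on the word "nvidia" in the package name is an assumption based on names like x11-driver-video-nvidia-current and dkms-nvidia-current; check the list before removing anything.

```shell
#!/bin/sh
# Hypothetical helper: keep only package names containing "nvidia"
# (reads one package name per line on stdin).
filter_nvidia() {
    grep -i 'nvidia' || true
}

if command -v rpm >/dev/null 2>&1; then
    # Query installed package names only, one per line.
    pkgs=$(rpm -qa --qf '%{NAME}\n' | filter_nvidia)
    if [ -n "$pkgs" ]; then
        echo "Installed nvidia packages found. To remove them, run as root:"
        # Unquoted on purpose so the names are joined on one line.
        echo "urpme $(echo $pkgs)"
    else
        echo "no nvidia packages installed"
    fi
fi
```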
Last edited by arnesp on Apr 14th, '18, 15:32, edited 2 times in total.

Re: boot loop after kernel upgrade using nvidia driver

Postby killerkaninchen » Apr 14th, '18, 14:12

Booting without "nokmsboot" works for nouveau.
Removed nvidia-packages, moved xorg.conf and then used XFdrake to install them again. Output:
Code: Select all
 
killerkaninchen@localhost:~$ XFdrake
Too late to run INIT block at /usr/lib/perl5/vendor_perl/5.22.2/x86_64-linux-thread-multi/Glib/Object/Introspection.pm line 257.
Ignore the following Glib::Object::Introspection & Gtk3 warnings
Subroutine Gtk3::main redefined at /usr/lib/perl5/vendor_perl/5.22.3/Gtk3.pm line 525.
getting exclusive lock on rpm
getting lock on urpmi
using mirror http://mirror2.tuxinator.org/mageia/distrib/6/x86_64
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Core Release.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Nonfree Release.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Core Release2.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Core Updates.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Nonfree Release2.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Nonfree Updates.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Tainted Release.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Tainted Updates.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Core 32bit Release.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Core 32bit Updates.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Nonfree 32bit Release.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Nonfree 32bit Updates.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Tainted 32bit Release.cz]
examining synthesis file [/var/lib/urpmi/synthesis.hdlist.Tainted 32bit Updates.cz]
examining synthesis file [/var/lib/urpmi/Core Release3/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Core Updates2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Nonfree Release3/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Nonfree Updates2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Tainted Release2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Tainted Updates2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Core 32bit Release2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Core 32bit Updates2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Nonfree 32bit Release2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Nonfree 32bit Updates2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Tainted 32bit Release2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/Tainted 32bit Updates2/synthesis.hdlist.cz]
examining synthesis file [/var/lib/urpmi/google-chrome/synthesis.hdlist.cz]
To satisfy dependencies, the following packages are going to be installed:
=> ok(auto)


retrieving rpm files from medium "Nonfree Updates"...
    http://mirror.netcologne.de/mageia/distrib/6/x86_64/media/nonfree/updates/x11-driver-video-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm
    http://mirror.netcologne.de/mageia/distrib/6/x86_64/media/nonfree/updates/nvidia-current-doc-html-390.42-1.mga6.nonfree.x86_64.rpm
    http://mirror.netcologne.de/mageia/distrib/6/x86_64/media/nonfree/updates/dkms-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm
retrieved x11-driver-video-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm nvidia-current-doc-html-390.42-1.mga6.nonfree.x86_64.rpm dkms-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm
...retrieving done
installing x11-driver-video-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm nvidia-current-doc-html-390.42-1.mga6.nonfree.x86_64.rpm dkms-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm from /var/cache/urpmi/rpms
starting installing packages
created transaction for installing on / (remove=0, install=0, upgrade=3)

Creating symlink /var/lib/dkms/nvidia-current/390.42-1.mga6.nonfree/source ->
                 /usr/src/nvidia-current-390.42-1.mga6.nonfree

DKMS: add Completed.

Preparing kernel 4.14.30-desktop-3.mga6 for module build:
(This is not compiling a kernel, just preparing kernel symbols)
Storing current .config to be restored when complete
Running Generic preparation routine
make mrproper.....
using /proc/config.gz
make oldconfig....
make prepare....

Building module:
cleaning build area....
'make' -j4 SYSSRC=/lib/modules/4.14.30-desktop-3.mga6/build modules..........
cleaning build area....
cleaning kernel tree (make mrproper)....

DKMS: build Completed.

nvidia-current.ko.xz:
 - Installation
   - Installing to /lib/modules/4.14.30-desktop-3.mga6/dkms/drivers/char/drm/

nvidia-modeset.ko.xz:
 - Installation
   - Installing to /lib/modules/4.14.30-desktop-3.mga6/dkms/drivers/char/drm/

nvidia-drm.ko.xz:
 - Installation
   - Installing to /lib/modules/4.14.30-desktop-3.mga6/dkms/drivers/char/drm/

nvidia-uvm.ko.xz:
 - Installation
   - Installing to /lib/modules/4.14.30-desktop-3.mga6/dkms/drivers/char/drm/

depmod.....

DKMS: install Completed.
removing installed rpms (x11-driver-video-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm nvidia-current-doc-html-390.42-1.mga6.nonfree.x86_64.rpm dkms-nvidia-current-390.42-1.mga6.nonfree.x86_64.rpm) from /var/cache/urpmi/rpms
----------------------------------------------------------------------
More information on package x11-driver-video-nvidia-current-390.42-1.mga6.nonfree.x86_64
This driver is for GeForce 420 and later cards.

Use XFdrake to configure X to use the correct NVIDIA driver. Any needed
packages will be automatically installed if not already present.
1. Run XFdrake as root.
2. Go to the Graphics Card list.
3. Select your card (it is usually already autoselected).
4. Answer any questions asked and then quit.

If you do not want to use XFdrake, see README.manual-setup.

----------------------------------------------------------------------
unlocking urpmi database
unlocking rpm database

No errors here, I think.
During installation an error message appeared in another window (translated from German to English):
Code: Select all
(EE)
Fatal server error:

Try to change some options.

I left the options unchanged. I had seen this message several times before, and the nvidia driver worked with kernel 4.14.20 despite it.
Then I blacklisted nouveau, set the nokmsboot option, and updated grub.
Now I will reboot...
...and now it works!

Thanks for your help!
Killerkaninchen