Trouble with Mageia1 on HP DL320 and DL360

Postby taraj » Mar 6th, '12, 08:11

Hello,

I have a custom Mageia 1 build that we use to build machines in the field. It is a fully automated install: insert the DVD, select the build, and it uses a cfg file in the i586/configs directory. It works fine on HP desktop PCs, e.g. the dc7700 and 7800. We also need to install this build on the HP ProLiant DL320 G6 and DL360 G7, but I am having a lot of trouble with these servers freezing during boot-up. Has anyone else successfully used them?

I have 2 PCI cards in the DL360: one is for a Matrox Extio F2408 and the other is a sync serial comms card. The DL320 has no extra hardware.
We have had trouble with IRQs before, but if I take the comms card out I still have the same problem.


Some of the screen output I get is shown below. It does not always freeze at the same place. (All of this is hand-typed, as I can't log in, and in safe mode I can't mount a USB stick.)


[end trace....]
EDAC MC:Ver 2.1.0 May 22 2011
iTCO_vendor support: vendor_support=0
udevd-work[224]: `/sbin/modprobe -bv pci:....... unexpected exit with status 0x0009`


EDAC i7core: Driver Loaded
iTCO_wdt: Intel TCO watchdog timer driver v1.06
unable to reset NO_REBOOT flag, device disabled by hardware/BIOS



udevd-work[170] worker [228] failed while handling /devices/pci .... 4/0000:02:00.2

This 02:00.2 is "system peripheral: Hewlett-packard company iLO3 management processor support and messaging"

I have disabled iLO3 in the BIOS and still have the same problem.


The DL320 regularly gets stuck on tg3.


I can't get into safe mode on the DL360 either. In normal mode I have tried various boot parameters. To start with, just acpi=strict and nouveau.blacklist=1 worked fine, but this stopped working for some reason. I then tried adding other options such as radeon.modset=0 and pci_use_crs, but had no luck. On the DL320, with all these parameters I can get into safe mode and look at dmesg. The only thing that might be relevant was:

Firmware Bug: the BIOS has corrupted hw-PMU resources
PME# disabled


Not sure what this means or how to fix it.

Any help greatly appreciated.

cheers,
Tara

P.S. A custom build on Mandriva 2010.2 worked on the DL320 G6.


EDIT: Got in and have attached the log files. I had to use this kernel command line: BOOT_IMAGE=linux root=UUID=07dff7ef-bceb-4eb4-9d1c-373e180c7dc8 resume=UUID=703a2b81-41dd-47b3-84e2-da39bdff4605 acpi=off nouveau.blacklist=0 modprobedebug udevdebug pci=use_crs
I am not sure if this will work every time. How bad is it to use all these options?
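
To compare which options differ between a boot that works and one that hangs, it can help to split the command line into individual parameters. The following is only a minimal Python sketch, not something from the original setup: the function name parse_cmdline is hypothetical, and it assumes simple space-separated tokens with no quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameters.

    Tokens of the form key=value map key to value; bare flags such as
    'modprobedebug' map to None. Quoted values containing spaces are
    not handled in this sketch.
    """
    params = {}
    for token in cmdline.split():
        # Split on the first '=' only, so root=UUID=... keeps its value whole.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# The command line quoted in the post above.
working = parse_cmdline(
    "BOOT_IMAGE=linux root=UUID=07dff7ef-bceb-4eb4-9d1c-373e180c7dc8 "
    "resume=UUID=703a2b81-41dd-47b3-84e2-da39bdff4605 acpi=off "
    "nouveau.blacklist=0 modprobedebug udevdebug pci=use_crs"
)

# Print only the options that were added while troubleshooting.
for key in ("acpi", "nouveau.blacklist", "modprobedebug", "udevdebug", "pci"):
    print(key, "->", working.get(key))
```

On a running system the same function could be fed the contents of /proc/cmdline, which makes it easy to record exactly which options a successful boot used.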
Attachments
dl360-var-log-kernel-infolog.txt (81.86 KiB)
dl360-var-log-kernel-errorslog.txt (371 Bytes)

Postby tmb » Mar 10th, '12, 22:58

Two things first: are you using the latest BIOS?

Have you tried the latest kernel in updates, 2.6.38.8-server-10.mga?

Postby taraj » Mar 15th, '12, 01:27

Hi

I upgraded to 2.6.38.8-server-10.mga and installed the BIOS update on this page
http://h20000.www2.hp.com/bizsupport/Te ... b04c60cb56

but I am not really sure which BIOS updates I might need. I only updated the BIOS on the DL360 G7.

After the kernel update I could remove nmi_watchdog=1 from the boot parameters, and it seemed to boot OK on both the DL320 and DL360. acpi=off is still required to boot. If I change it to acpi=strict it does not work, the same as removing it completely.

I have attached screenshots of where both servers fail during boot-up; hopefully this will help.

Thank you!!
Attachments
DL360G7.jpg (DL360G7_bootup_failure)
DL320G6.jpg (DL320G6_bootup_failure)

Postby linuxero » Mar 15th, '12, 19:52

As far as I can see, this might be the same problem I am having with a Packard Bell Easynote laptop.

Could you please try another distro to check whether it installs or not?

I suspect the latest versions of udev have some kind of trouble with some components!

If you have the same problem with other distros, try an older release of some distro to confirm.

Thank you

Postby taraj » Mar 20th, '12, 06:06

Hi,

I have tried Fedora 14 and Fedora 15 and no luck with either of them:

Fedora 14 gives:

mounting /tmp as tmpfs...done
[4.911971] NMI: IOCK error (debug interrupt?)
process udevd(pid 194.....
Stack:
Call Trace:
running install
running /sbin/loader


Fedora 15 (2.6.38.6) gives output similar to Mageia's 2.6.38.8-server-10 about modprobe 0x0009, iTCO_wdt and NO_REBOOT.

Any ideas or fixes, anyone?

cheers.

Postby linuxero » Mar 20th, '12, 19:45

Well, I know how it feels! I am having the very same problem and am still trying to get past it. An older release would undoubtedly work fine. Debian might hang up every 3-5 minutes. For me Ubuntu 9.04 worked great.

Guess we both need a udev guru here :) but if I find something out it'll show on this forum :)

Keep the thread updated please :D

Postby doktor5000 » Mar 20th, '12, 23:50


Postby linuxero » Mar 21st, '12, 16:53



Thanks, this would be great

Postby taraj » Mar 22nd, '12, 02:16

Hi,

I set udev_log="debug" and rebooted without the acpi=off boot parameter, and got the following output at the point where booting stopped/froze:

udevd-work[442]: RUN 'socket:@org/freedesktop/hal/udev_event' /lib/udev/rules.d/90-hal.rules:2

udevd-work[442]: passed -1 bytes to socket monitor 0x9cd3f28
udevd-work[442]: passed -1 bytes to netlink monitor 0x....
seq 1591 processed with 0
seq 1591 done with 0
worker[190] exit
worker[190] unexpectedly returned with status 0x0100

failed while handling 'devices/pci.....1c.4/000:02:00.2'



I know that '02:00.2' is the Integrated Lights-Out 3 (iLO3), which I have disabled.
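
Since the freeze point varies from boot to boot, it may help to tally which PCI function udev reports failing on across several attempts. Below is a rough Python sketch of that idea, assuming the failure lines have been collected as text; the names PCI_ADDR and failing_devices are hypothetical, and the sample lines are the hand-typed (and partly elided) ones from this thread.

```python
import re
from collections import Counter

# Matches a PCI bus:device.function triple such as "02:00.2",
# optionally preceded by a domain prefix such as "0000:".
PCI_ADDR = re.compile(
    r"(?:[0-9a-fA-F]+:)?([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\b"
)


def failing_devices(lines):
    """Count the PCI functions named in udev 'failed while handling' lines.

    The last address in a device path is the device itself, so only
    the final match on each line is counted.
    """
    counts = Counter()
    for line in lines:
        if "failed while handling" not in line:
            continue
        matches = PCI_ADDR.findall(line)
        if matches:
            counts[matches[-1]] += 1
    return counts


# Lines copied from the posts in this thread, elisions left as typed.
sample = [
    "udevd-work[170] worker [228] failed while handling /devices/pci .... 4/0000:02:00.2",
    "failed while handling 'devices/pci.....1c.4/000:02:00.2'",
    "iTCO_wdt: Intel TCO watchdog timer driver v1.06",
]

for address, count in failing_devices(sample).items():
    print(address, count)
```

If the same address keeps showing up, that points at one device or its driver rather than at udev in general; if the addresses vary, the cause is more likely something broader such as ACPI or interrupt routing.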

As expected, /lib/udev/rules.d/90-hal.rules has the following line:
RUN+="socket:@/org/freedesktop/hal/udev_event"

Each time I reboot with acpi=off I get different output at the place where boot stops. I have attached a few images.

If I boot with acpi=off in the boot parameters, the udev log output is not written anywhere, and dmesg has pretty much nothing in it.

Any more ideas, anyone?

Thanks for helping!
Attachments
IMG027.jpg
IMG025.jpg
IMG024.jpg

Postby linuxero » Mar 22nd, '12, 16:12

Still looking for a udev guru. I'll try to contact the developer.

But do you agree that everything points to udev?

Postby wintpe » Mar 22nd, '12, 18:11

Going back to the distros: I install RH 6.2/5.8 on these hosts (or very similar ones, DL380 G6).

Try CentOS 5.6, as it has an older kernel.

This should work without problems. If that still gives issues, I would suspect the hardware.

regards peter