Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 6th, '12, 08:11
by taraj
Hello,

I have a custom Mageia 1 build that we use to build machines in the field. It is a fully automatic install: just put in the DVD and select the build, which uses a cfg file in the i586/configs directory. It works fine on HP desktop PCs, e.g. dc7700 and dc7800. We also need to install the build on the HP ProLiant DL320 G6 and DL360 G7, but I am having lots of trouble with these servers freezing during boot-up. Has anyone else successfully used them?

I have 2 PCI cards in the DL360: one is for a Matrox Extio F2408 and the other is a sync serial comms card. The DL320 has no extra hardware.
We have had trouble with IRQs before, but if I take the comms card out I still have the same problem.


Some of the screen output that I get is below. It does not always freeze at the same place. (All hand-typed, as I can't log in, and in safe mode I can't mount a USB stick.)


[end trace....]
EDAC MC:Ver 2.1.0 May 22 2011
iTCO_vendor support: vendor_support=0
udevd-work[224]: `/sbin/modprobe -bv pci:....... unexpected exit with status 0x0009`


EDAC i7core: Driver Loaded
iTCO_wdt: Intel TCO watchdog timer driver v1.06
unable to reset NO_REBOOT flag, device disabled by hardware/BIOS



udevd-work[170] worker [228] failed while handling /devices/pci .... 4/0000:02:00.2

This 02:00.2 is "System peripheral: Hewlett-Packard Company iLO3 management processor support and messaging".

I have disabled iLO3 in the BIOS and still have the same problem.
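To double-check what the kernel itself sees at that address, something like the following could be run from a shell that does come up (e.g. the acpi=off boot). It is only a rough sketch: it assumes the standard sysfs layout under /sys/bus/pci/devices and the address 0000:02:00.2 from the udev message above, and the function name is made up for illustration.

```python
import os


def describe_pci_device(address, sysfs_root="/sys/bus/pci/devices"):
    """Return the IDs and bound driver for one PCI function, or None if absent.

    `address` is the full domain:bus:device.function string, e.g. "0000:02:00.2".
    """
    dev_dir = os.path.join(sysfs_root, address)
    if not os.path.isdir(dev_dir):
        return None

    info = {}
    # vendor, device and class are small text files holding hex IDs.
    for name in ("vendor", "device", "class"):
        try:
            with open(os.path.join(dev_dir, name)) as fh:
                info[name] = fh.read().strip()
        except OSError:
            info[name] = None

    # "driver" is a symlink to the bound driver; it is missing if none is bound.
    driver_link = os.path.join(dev_dir, "driver")
    if os.path.islink(driver_link):
        info["driver"] = os.path.basename(os.path.realpath(driver_link))
    else:
        info["driver"] = None
    return info


if __name__ == "__main__":
    print(describe_pci_device("0000:02:00.2"))
```

If the function returns None the device is not enumerated at all; if "driver" is None, no kernel driver is currently bound to it.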


The DL320 regularly gets stuck on tg3.


I can't get into safe mode on the DL360 either. In normal mode I have tried various boot parameters. To start with, just acpi=strict and nouveau.blacklist=1 worked fine, but this stopped working for some reason. Then I tried adding other options such as radeon.modeset=0 and pci=use_crs, but no luck. On the DL320, with all these parameters, I can get into safe mode and have a look at dmesg. The only thing that might be relevant was:

Firmware Bug: the BIOS has corrupted hw-PMU resources
PME# disabled


Not sure what this means or how to fix it.

Any help greatly appreciated.

cheers,
Tara

p.s. a custom build on Mandriva 2010.2 worked on the DL320 G6.


EDIT: Got in and have attached the log files. Had to use:
Kernel command line: BOOT_IMAGE=linux root=UUID=07dff7ef-bceb-4eb4-9d1c-373e180c7dc8 resume=UUID=703a2b81-41dd-47b3-84e2-da39bdff4605 acpi=off nouveau.blacklist=0 modprobedebug udevdebug pci=use_crs
Not sure if this will work every time. How bad is it to use all these options?
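When juggling this many boot parameters it can help to see exactly which ones the running kernel received. Below is a small sketch that splits a kernel command line into key/value pairs; the sample string is the one quoted above, and on a live system the same text can be read from /proc/cmdline. The function name is just for illustration.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameters.

    "key=value" tokens map key -> value (split at the first "=" only, so
    root=UUID=... keeps "UUID=..." as its value). Bare flags such as
    "udevdebug" map to True. If a key is repeated, the last one wins here.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


SAMPLE = (
    "BOOT_IMAGE=linux root=UUID=07dff7ef-bceb-4eb4-9d1c-373e180c7dc8 "
    "resume=UUID=703a2b81-41dd-47b3-84e2-da39bdff4605 acpi=off "
    "nouveau.blacklist=0 modprobedebug udevdebug pci=use_crs"
)

if __name__ == "__main__":
    for key, value in sorted(parse_cmdline(SAMPLE).items()):
        print(key, value)
```

Comparing the output between a boot that works and one that freezes makes it easier to drop options one at a time and find which are actually needed.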

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 10th, '12, 22:58
by tmb
Two things first: are you using the latest BIOS?

Have you tried the latest kernel in updates, 2.6.38.8-server-10.mga?

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 15th, '12, 01:27
by taraj
Hi

I upgraded to 2.6.38.8-server-10.mga and installed the BIOS update on this page
http://h20000.www2.hp.com/bizsupport/Te ... b04c60cb56

but I'm not really sure what BIOS updates I might need. I only did the BIOS update on the DL360 G7.

After the kernel update I could remove the nmi_watchdog=1 from the boot parameters and it seemed to boot ok on both the DL320 and DL360. The acpi=off is still required to boot up. If I change it to acpi=strict it does not work - same as removing it completely.

I have attached screenshots of where both servers fail during boot-up; hopefully this will help.

Thank you!!

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 15th, '12, 19:52
by linuxero
As far as I can see, this might be the same problem I am having with a Packard Bell EasyNote laptop.

Could you please try any other distro to check whether it installs or not?

I suspect the latest versions of udev have some kind of trouble with some components!

If you have the same problem with other distros, try an older release of some distro to confirm.

Thank you

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 20th, '12, 06:06
by taraj
Hi,

I have tried Fedora 14 and Fedora 15, with no luck on either of them:

Fedora 14 gives:

mounting /tmp as tmpfs...done
[4.911971] NMI: IOCK error (debug interrupt?)
process udevd(pid 194.....
Stack:
Call Trace:
running install
running /sbin/loader


Fedora 15 (2.6.38.6) gives similar output to Mageia 2.6.38.8-server-10 about modprobe 0x0009, iTCO_wdt and NO_REBOOT.

Any ideas or fixes, anyone?

cheers.

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 20th, '12, 19:45
by linuxero
Well, I know how it feels! I am having the very same problem and am still trying to get past it. An older release would undoubtedly work fine. Debian might hang up every 3-5 minutes. For me Ubuntu 9.04 worked great.

Guess we both need a udev guru here :) but if I find something out I'll post it on this forum :)

Keep the thread updated please :D

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 20th, '12, 23:50
by doktor5000

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 21st, '12, 16:53
by linuxero


Thanks, this would be great

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 22nd, '12, 02:16
by taraj
Hi,

I set udev_log="debug" and rebooted without the acpi=off boot parameter, and got the following output at the point where booting stopped/froze:

udevd-work[442]: RUN 'socket:@org/freedesktop/hal/udev_event' /lib/udev/rules.d/90-hal.rules:2

udevd-work[442]: passed -1 bytes to socket monitor 0x9cd3f28
udevd-work[442]: passed -1 bytes to netlink monitor 0x....
seq 1591 processed with 0
seq 1591 done with 0
worker[190] exit
worker[190] unexpectedly returned with status 0x0100

failed while handling 'devices/pci.....1c.4/000:02:00.2'



I know that '02:00.2' is the Integrated Lights-Out 3, which I have disabled.

As expected, /lib/udev/rules.d/90-hal.rules has the following line:
RUN+="socket:@/org/freedesktop/hal/udev_event"

Each time I reboot with acpi=off I get different output at the place where boot stops. I have attached a few images.

If I boot with acpi=off in the boot parameters, the udev log output is not written anywhere, and dmesg has pretty much nothing in it.

Any more ideas anyone??

Thanks for helping!
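A note on the hex status values in the udev messages in this thread (0x0009 from modprobe, 0x0100 from the worker): they look like raw wait() status words rather than plain exit codes. That is an assumption, not something checked against the udev source for this version. If it holds, they can be decoded with the usual Linux encoding, where the low 7 bits hold a terminating signal number and the next byte holds the exit code. A minimal sketch, with an illustrative function name:

```python
import signal


def decode_wait_status(status):
    """Decode a raw wait() status word, assuming the Linux bit layout."""
    low = status & 0x7F
    if low == 0:
        # Normal exit: the exit code sits in bits 8-15.
        return "exited with code %d" % ((status >> 8) & 0xFF)
    if low == 0x7F:
        # Stop/continue notifications, not a termination.
        return "stopped or continued (not a termination)"
    try:
        name = signal.Signals(low).name
    except ValueError:
        name = "signal %d" % low
    core = " (core dumped)" if status & 0x80 else ""
    return "killed by %s%s" % (name, core)


if __name__ == "__main__":
    for status in (0x0009, 0x0100):
        print("0x%04x: %s" % (status, decode_wait_status(status)))
```

Under that assumption, 0x0100 would be an ordinary exit with code 1, while 0x0009 would mean the modprobe process was killed by signal 9 rather than failing on its own, which would fit a process that hung and was then killed.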

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 22nd, '12, 16:12
by linuxero
Still looking for a udev guru. I'll try to contact the developer.

But do you agree that everything points to udev?

Re: Trouble with Mageia1 on HP DL320 and DL360

Posted: Mar 22nd, '12, 18:11
by wintpe
Going back to the distros: I install RHEL 6.2/5.8 on these hosts (or very similar ones, DL380 G6).

Try CentOS 5.6, as it has an older kernel.

This should work with no problem; if that still gives issues, I would suspect the hardware.

regards peter