Full Version: Odd 'lock ups'?
Urza9814
Ok, so every so often, my computer appears to lock up. It's still running, because I'm still getting emails that backups were completed, I'm still on AIM, and externally everything seems fine. But the screens are black and getting no signal from the computer, and the keyboard won't work - even the numlock and capslock lights won't toggle. Usually I notice it within a few hours; it happens every couple of days. Last time this happened, I tried to log in to webmin, which should have been running and unblocked by my firewall, but no luck. SSH was being blocked. The only thing I'm seeing in the system logs before the crash is 'crond (mail)' messages, so I'm halting crond. My next attempt will be disabling my KDE screen saver (though I've already tried changing it to 'none').

Basically, what I want to know is, will halting crond cause any problems (other than, I'm guessing, stopping my nightly backups from running)? And does anyone have any other ideas what the problem may be and/or how I could figure it out? I'm running Mandriva 2008 Spring, getting all the updates, and this problem has pretty much been occurring since installing...but that install also coincided with some massive hardware updates, so that's more likely the problem.
David Batson
Are you using a USB keyboard? If so, perhaps an energy saving feature for your USB ports is disabling it. This feature could be set in the BIOS or possibly Mandriva (not sure).

The screen could be black because of the display energy saving feature kicking in after a period of inactivity. Check your display settings and KPowerSave.

One other thing. If this is a desktop PC, make sure all your cards are fully seated. I have had problems with cards unseating themselves after a period of time.
daniewicz
You mention "massive hardware updates." As David said, try re-seating your cards. I would also clear the CMOS on the motherboard.

Urza9814
No, PS/2 keyboard. Good 'ol IBM Model M :)

I never did really consider it could be a power saving feature...it seems to happen quite randomly though. But I'll look into those...along with re-seating my cards.
tyme
disable compiz, metisse, etc.
if you are using a proprietary 3d driver, change back to the Xorg version (i.e. nv for nvidia).
completely disable screensaver.

see if problem persists.
ianw1974
Disable all power-saving features too in case this is causing it. I've had problems with a system going into power-save mode and from time to time I could never get it to wake up or even connect to it across the network. It always required a hard reset to get it going again.

I did fix it in the end, just can't remember what it was but I'm sure it was something to do with power management.
Urza9814
Ok...um...where do I change the power saving options? KPowerSave is configured to leave everything running, though KPowerSave itself isn't even running. And I can't seem to find any power saving options in MCC or in KDECC.
ianw1974
You'll have options for the monitor to power off, etc. I don't use KDE and haven't for a while, but I know you can stop it spinning down your hard disks, stop your monitor switching off, stop your machine going to standby or hibernate, and so on. I've usually gone through all these in GNOME, as that is what I'm currently using.
Urza9814
Ok, so it happened again last night. My computer was only idle for about 8 hours, while the day before I left it idle for over 20 to see if it was a timed thing. Doesn't look like it, so I would think that rules out some kind of power saving setting that I can't find (as far as I can tell, they're all off)
Screensaver is also turned off, along with compiz and stuff...never used it to begin with. I am using the proprietary Nvidia drivers, because I can't get dual monitors working on the generic ones...but I suppose I'll try those next.
Here's all the messages in the syslog between when I last used it and when it died too, though I can't see anything wrong.

CODE
Aug 16 15:20:14 localhost gconfd (urza9814-17145): starting (version 2.22.0), pid 17145 user 'urza9814'
Aug 16 15:20:14 localhost gconfd (urza9814-17145): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
Aug 16 15:20:14 localhost gconfd (urza9814-17145): Resolved address "xml:readonly:/etc/gconf/gconf.xml.local-mandatory" to a read-only configuration source at position 1
Aug 16 15:20:14 localhost gconfd (urza9814-17145): Resolved address "xml:readwrite:/home/urza9814/.gconf" to a writable configuration source at position 2
Aug 16 15:20:14 localhost gconfd (urza9814-17145): Resolved address "xml:readonly:/etc/gconf/gconf.xml.local-defaults" to a read-only configuration source at position 3
Aug 16 15:20:14 localhost gconfd (urza9814-17145): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 4
Aug 16 21:06:49 localhost kernel: crashreporter[22907]: segfault at b707dda0 eip b707dda0 esp bf908a1c error 14
Aug 16 22:22:32 localhost kernel: usb 2-6: USB disconnect, address 6
Aug 16 22:22:32 localhost kernel: ehci_hcd 0000:00:13.2: qh f6d28100 (#01) state 4
Aug 18 07:38:25 localhost syslogd 1.4.2: restart.
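For anyone wanting to pull the same kind of excerpt out of a longer log, here is a minimal Python sketch. It assumes that a "syslogd ... restart" line marks the reboot after a lock-up, as it does in the excerpt above. The function name and the sample list are made up for illustration; the sample lines themselves are copied from the log above.

```python
import re

# Assumption: a "syslogd <version>: restart." line marks the reboot
# that followed a lock-up, as in the excerpt above.
RESTART = re.compile(r"syslogd .*restart")


def lines_before_restart(log_lines, count=3):
    """Return, for each syslogd restart, the `count` lines logged just before it."""
    results = []
    for i, line in enumerate(log_lines):
        if RESTART.search(line):
            results.append(log_lines[max(0, i - count):i])
    return results


# Sample lines copied from the log excerpt above.
sample = [
    "Aug 16 21:06:49 localhost kernel: crashreporter[22907]: segfault at b707dda0 eip b707dda0 esp bf908a1c error 14",
    "Aug 16 22:22:32 localhost kernel: usb 2-6: USB disconnect, address 6",
    "Aug 16 22:22:32 localhost kernel: ehci_hcd 0000:00:13.2: qh f6d28100 (#01) state 4",
    "Aug 18 07:38:25 localhost syslogd 1.4.2: restart.",
]

for group in lines_before_restart(sample):
    for line in group:
        print(line)
```

To run it against a real log, read the file into a list of lines first (the log path varies by setup, so it is left out here).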
ianw1974
You can definitely rule out power saving by just simply adding:

CODE
acpi=off


to the end of the kernel line in /boot/grub/menu.lst
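If you would rather not edit the file by hand, the same change can be sketched in Python. This is only an illustration: the function name is invented, and the sample menu.lst entry below is hypothetical, not taken from anyone's actual system. It works on text only and writes nothing to disk, so back up the real file before applying anything like it.

```python
def add_kernel_param(menu_text, param="acpi=off"):
    """Append `param` to every GRUB legacy `kernel` line that lacks it."""
    out = []
    for line in menu_text.splitlines():
        words = line.split()
        # Only touch lines whose first word is the `kernel` directive.
        if words and words[0] == "kernel" and param not in words:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical menu.lst entry, for illustration only.
example = (
    "title linux\n"
    "kernel (hd0,0)/boot/vmlinuz root=/dev/sda1\n"
    "initrd (hd0,0)/boot/initrd.img\n"
)

print(add_kernel_param(example), end="")
```

Running it twice on the same text leaves the kernel line unchanged the second time, since the parameter is only added when it is missing.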
SilverSurfer60
Aug 16 21:06:49 localhost kernel: crashreporter[22907]: segfault at b707dda0 eip b707dda0 esp bf908a1c error 14
Aug 16 22:22:32 localhost kernel: usb 2-6: USB disconnect, address 6
Aug 16 22:22:32 localhost kernel: ehci_hcd 0000:00:13.2: qh f6d28100 (#01) state 4

Aug 18 07:38:25 localhost syslogd 1.4.2: restart.

There is your fault report. What do you have running from USB? I used to have a USB extender that did exactly what you are experiencing and it turned out to be the extender itself. It's now a paper weight.
Urza9814
Ah! Hmm...my mouse (a basic logitech wireless - never had a problem with it before), my printer (Brother HL-2040 laser), and...hey, there was a third wire back there that I couldn't figure out. Turns out it's a short extender I forgot about. Not sure if that's the problem though or not...because I _think_ I was having the problem before plugging it in. But we'll see. Thanks :)

If it is the USB extender...well, all I can say to that is WTF.
SilverSurfer60
Well if it turns out to be the extender you will also have a novel paper weight. Let us know the outcome.
ianw1974
Aye keep us posted, I'll have to bear this USB thing in mind if I use one of these extenders.
Urza9814
Well, I removed the extender, and it's not throwing the USB error anymore, but it's still locking up. It has twice now, with the following errors:

CODE
Aug 18 11:33:39 localhost kernel: crashreporter[4818]: segfault at b707eda0 eip b707eda0 esp bf92523c error 14
CODE
Aug 20 02:39:19 localhost kernel: pulseaudio[6543]: segfault at 00003348 eip b7ece5f8 esp b75e2de4 error 4


Just noticing however, they all seem to occur shortly after Mandriva checks for updates. I'll try disabling that and see what happens.
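The kernel segfault lines in this post can be picked apart with a short Python sketch. Two assumptions here: that the number after "error" is printed in hexadecimal, and that its bits follow the usual x86 page-fault error code layout (bit 0 = page was present, bit 1 = write, bit 2 = user mode, bit 4 = instruction fetch). The function and field names are made up for illustration.

```python
import re

# Matches the 32-bit kernel message format seen in the log lines above.
SEGFAULT = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"eip (?P<eip>[0-9a-f]+) esp (?P<esp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse a kernel segfault line and decode its page-fault error code."""
    m = SEGFAULT.search(line)
    if m is None:
        return None
    # Assumption: the error code is hexadecimal, so "error 14" means 0x14.
    err = int(m.group("err"), 16)
    return {
        "program": m.group("prog"),
        "address": int(m.group("addr"), 16),
        "eip": int(m.group("eip"), 16),
        "page_present": bool(err & 0x01),
        "write": bool(err & 0x02),
        "user_mode": bool(err & 0x04),
        "instruction_fetch": bool(err & 0x10),
    }


lines = [
    "Aug 18 11:33:39 localhost kernel: crashreporter[4818]: segfault at b707eda0 eip b707eda0 esp bf92523c error 14",
    "Aug 20 02:39:19 localhost kernel: pulseaudio[6543]: segfault at 00003348 eip b7ece5f8 esp b75e2de4 error 4",
]

for line in lines:
    print(decode_segfault(line))
```

Under those assumptions, the crashreporter fault is a user-mode instruction fetch from an unmapped address (the faulting address equals eip), and the pulseaudio fault is a user-mode read of an unmapped low address. Neither says what the underlying cause is.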
tux99
I don't know if you still have this problem, but if so, I would recommend running memtest. Download the following ISO image, burn it to CD, boot from it, and let memtest run for at least 24 hours:

http://www.memtest86.com/memtest86-3.4a.iso.zip

Random segfaults and lockups like the ones you are seeing are often caused by memory errors (either defective memory or wrong BIOS timing/voltage settings).
If memtest finds errors, run it again with just a single DIMM inserted (if you have more than one), testing each in turn until you find the defective one. If they all give errors, the problem is more likely BIOS settings or even a CPU or motherboard fault.