uralmasha

Sudden persistent kernel panics [kind of resolved]


All of a sudden my system hoses up completely, sometimes right during boot, sometimes a bit later, but always pretty soon. The latest things I did to the system were an update to Thac's X11 (2 days ago) and a regular urpmi --auto-select, which installed 3 packages from the official repo.

 

Well, it looks like a hardware fault, except that Ubuntu on the same machine and the same hard drive seems to run OK.

 

I eventually reinstalled Mandriva 2006, as neither rescue nor "upgrade" from the disk really helped. This is the last screen I got, after a clean install with formatting of the root partition:

 

General protection fault: 0000 [1] SMP
CPU 1
Modules linked in: usb_storage usbmouse usbhid usbkbd usblp ehci_hcd ohci_hcd usbcore evdev ext3 jbd sd_mod sata_nv libata scsi_mod
Pid: 0, comm: swapper Not tainted 2.6.12-12mdksmp
RIP: 0010:[ffffffff80194da7] <ffffffff80194da7> {tasklet_action+55}
RSP: 0018:fffff EFLAGS: 00010286
<the same kind of hexadecimal numbers with RBX, RCX, RDX ... etc. prefixes, about 24 in total; they look like register contents>

Further there is stack and Call Trace information,

Code: 48 8b ... etc

<0>Kernel panic - not syncing: Aiee, killing interrupt handler!

 

I ran fsck.ext3 -cv /dev/sda5 on the root partition and it seems to be in order. There were no errors, only a message that the "file system was modified" -- I suppose this is the consequence of the number of hard resets I had to apply to the PC :-(

 

Anyway, what next? Does anyone have the faintest idea what this might be, how to fix it and, more importantly, how to avoid it in the future?

 

Please, help if you have any idea!

Edited by uralmasha


Just something worth trying, since you've already tried formatting when installing (I presume during the installation, when booted from the CD).

 

So, I would suggest booting into Ubuntu if it's running fine, and then you can issue the commands manually to reformat the /dev/sda5 partition that has Mandriva 2006 on it.

 

mke2fs -j /dev/sda5

 

and see if it comes back with any errors. If not, then you can create a mountpoint for it, and then mount it, so:

 

mkdir /mnt/mandriva
mount /dev/sda5 /mnt/mandriva

 

and see if any errors come back. If that proves successful, then I'd be tempted to think that the installer was having problems somehow because of a screwed partition.


Check your /var/log/messages file. It will tell you if there is a hardware problem on your box when using Mandriva.
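
A minimal sketch of how one might pull the suspicious lines out of that file, assuming a syslog-style /var/log/messages. The function name scan_kernel_log and the keyword list are illustrative assumptions, not anything Mandriva ships:

```shell
#!/bin/sh
# scan_kernel_log (hypothetical helper): print kernel log lines that often
# accompany hardware or driver trouble. The keyword list is an assumption
# chosen for illustration; it is neither exhaustive nor authoritative.
scan_kernel_log() {
    scan_file="${1:-/var/log/messages}"
    grep -i -E 'kernel:.*(panic|oops|general protection|bogus|i/o error|machine check)' "$scan_file"
}
```

Called as scan_kernel_log /var/log/messages (usually as root), it prints only the matching kernel lines. An empty result does not prove the hardware is fine, it only means none of these keywords were logged.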


Thank you guys for the quick reaction!

 

Arctic,

/var/log/messages is just gone ! :oops:

However, there were some errors reported in there, related to a "bogus return value" in the modules sata_nv and forcedeth. I'll post them here, as I have an extract in my knotes.

 

I had checked /var/log/kernel/errors, and there was nothing unusual. It had always complained about hid_core.c failing, something related to USB, and about not being able to allocate region 3 of device 00:00:02:blah:blah. I don't know what it is; Ubuntu complained about the same two things, too... Since it worked, I left it as is.

 

Some time ago it reported that it could not find System.map ("the kernel modules will not be loaded"), but this had been happening for a while (a few days or so), and the modules did in fact load. I just did not have a chance to look at it before yesterday.

 

ianw1974,

 

I re-created the partition manually, and it did not complain. Now I am hesitating whether I should try Windows (for troubleshooting) or have another go at Mandriva...


I'd go with Mandriva one more time and install it into the freshly formatted /dev/sda5, which you now have without errors. I believe the installer will let you select this existing disk partition.

 

And see how you get on. Windows might be a bit tricky trying to get it in /dev/sda5 without upsetting the rest of the system!


Yeah,

the first 1 GB of /dev/sda is left empty in case Windoze is ever needed... However, it just told me it would need at least 1033 MB of space, some 20 MB more than I reserved...

So, I gave Mandriva another chance.


I am wondering a bit... Thac's packages can't have caused this, as it seems to be a kernel-related problem. Sometimes kernels are not yet fully compatible with newer hard disks, especially if they are handled as /dev/sdX devices, and need patching. I had this once, too. A new hard disk crashed for no apparent reason until I found out that the problems were caused by a system instability when using ext3 with kernel 2.6.14. I discussed this with the developers and they took care of the problem; using kernel 2.6.15 with ReiserFS solved the instability for me.

I don't know how old/new your hardware is, but there is always a chance that the kernel refuses to play nicely with your hard disk.

 

PS: I just saw that you are using the x86-64 version of Mandriva. For testing purposes it would be good to know whether you have the same problem with the 32-bit version of Mandy. Sometimes problems occur only with 64-bit optimized distros.


Arctic,

I don't think it has anything to do with Thac either: the crash happened 2 days after I updated to his -28 build, and I suppose a problem there would have manifested itself much earlier than yesterday.

 

I have been using this PC since August, so it is reasonably new. I started with the LE edition (very spartan on x86-64), then cooker, and have now upgraded to 2006. I can't complain about stability, if I understand the term right. There is a dead kded now and then, but that's it. I did, however, have some deeper trouble once in the fall, when my ethernet interface got switched with the ieee1394 interface, all of a sudden, too.

 

I check /var/log/messages sporadically, and kernel/errors, too. Nothing fancy there. The only things the kernel complained about were the 2 errors I mentioned above.

 

I have compiled and used "vanilla" kernels 2.6.14 (to solve the timer problem I had with 2.6.12, and I'm likely to do it again :-( ) and then 2.6.15, using Mandriva's .config as the basis for oldconfig and following a tutorial. This is actually the reason I decided in the end to wipe the partition: I thought Mandy had hit some bad code in 2.6.15 that it had not run into before (there had been some intensive database and Apache usage just hours before the clinical death of my PC).

 

I don't have a spare partition just now, apart from the Ubuntu one. If this installation does not screw up, I'd prefer not to turn it into a 32-bit one. Otherwise I'll report here when I have the 32-bit install.

 

BTW, would ReiserFS on the root partition / be enough, or would all partitions need to be re-formatted/converted?


ReiserFS would be fine for /dev/sda5; you don't need to convert all of them, they can stay as they are. Maybe you'll have more luck with ReiserFS. Make sure it's compiled into the kernel of course, if you do compile a new one :P

 

I just stick to the standard Mandriva kernels, just in case. The latest with 2006 is 2.6.12.14, but you have to upgrade to it; a fresh install gives you 2.6.12.12.


Can you boot with ALL USB and PCMCIA devices unplugged?

If you can then it's a clear hotplug problem, best addressed by pushing hotplug in the bin.

Edited by scarecrow


Uhm. Unplugging usb connectors from the motherboard? Pull the TV card?

Is that what you mean, scarecrow?

 

btw,

here is the error from /var/log/messages from the previous incarnation of my Mandriva, which I accidentally saved:

 

Jan 18 20:00:25 kokoc kernel: [<ffffffff88031000>] (nv_interrupt+0x0/0xe0 [sata_nv])
Jan 18 20:00:25 kokoc kernel: [<ffffffff881ef9a0>] (nv_nic_irq+0x0/0x4f0 [forcedeth])
Jan 18 20:00:28 kokoc kernel: irq event 217: bogus return value 7ffbbe39
Jan 18 20:00:28 kokoc kernel: handlers:
Jan 18 20:00:28 kokoc kernel: [<ffffffff88031000>] (nv_interrupt+0x0/0xe0 [sata_nv])
Jan 18 20:00:28 kokoc kernel: [<ffffffff881ef9a0>] (nv_nic_irq+0x0/0x4f0 [forcedeth])
Jan 18 20:00:28 kokoc kernel: irq event 217: bogus return value 2c1d221

 

So far it runs, no obvious errors in messages.
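
For what it's worth, here is a rough sketch of how messages like the ones above could be tallied per IRQ number, to see whether it is always the same interrupt line that misbehaves. The function name count_bogus_irqs is hypothetical, and the pattern is assumed only from the message format in the extract above:

```shell
#!/bin/sh
# count_bogus_irqs (hypothetical helper): count "bogus return value" reports
# per IRQ number in a syslog-style file. The pattern is assumed from the
# extract above; other kernel versions may word the message differently.
count_bogus_irqs() {
    awk '/irq event [0-9]+: bogus return value/ {
             for (i = 1; i < NF; i++)
                 if ($i == "event") {
                     irq = $(i + 1)
                     sub(/:$/, "", irq)   # strip the trailing colon
                     seen[irq]++
                 }
         }
         END { for (irq in seen) print irq, seen[irq] }' "$1"
}
```

Each output line is an IRQ number followed by how many times it was reported. Fed the seven lines above, it prints "217 2", the one IRQ that sata_nv and forcedeth are both listed as handlers for.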


OK. The new installation did not help. I re-partitioned /dev/sda to increase the size of /dev/sda1 (empty, added 200M) and /dev/sda5 (formatted as ReiserFS), where I installed Mandriva 2006. Eventually I got the same result as before: it hangs at startup or later.

 

I would not like to open the case yet, as it is sealed by the shop. Maybe later, when I have no other option.

 

Now the question: are there any live-CDs with diagnostic tools for memory/disc surface? Sigh.

Edited by uralmasha


I'm sure a Knoppix live CD would have one. What I would suggest doing is running memtest to see if there are any problems with your hardware.

 

I'm not sure if this tests just the memory, or more. I had problems with Mandriva crashing on my system at home, while Windows was perfectly fine. When memtest was reporting errors, all I did to solve it in the end was to choose "Load Optimised Defaults" in the BIOS, and the problems then went away. Obviously, in my case, the BIOS was configured incorrectly.

 

I'm not saying that this is your problem, but it might be. Maybe try the load optimised option in the BIOS if you have it, and then try with Mandriva and see if it boots etc.


Thank you, ianw1974.

 

I'll give memtest a try... it's gonna take a long time. I would say kernel compilation is already a good memtest, though.

 

Anyway, I am not entirely sure about the BIOS, as I have not changed anything in there recently. Just in case, I will load those "optimised defaults" (it seems we have similar BIOS versions).

 

I'll report back here when it's done.
