Why would a filesystem suddenly become read-only?


ffrr

I have had this happen a few times. Today when I logged in (the computer runs 24 hours a day) I found that MySQL had stopped running. I soon discovered this was because the whole partition holding /var had become read-only. I had to reboot. The reboot found errors on that partition and fixed them, so it is up and running again.

 

I've had this happen to a data partition as well. In fact, I find the filesystem (I'm using ext3) under Mandriva quite flaky. Is it because I leave the machine on all the time? I do that because I run MythTV on it.

 

Anyway, why does it do this (get remounted read-only)? Is it because of the errors, and if so, why is it getting these errors? Should there be a log somewhere about what happened?

 

This is the second hard disk I have tried (both were brand-new 80 GB SATA drives). I also have a large 250 GB SATA drive that I store the MythTV data on. It played up when I had XFS on it, but seems more stable now that I have ext3 on it. Is Linux perhaps not very stable on SATA drives?
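One way to notice this condition before a service like MySQL falls over is to look at the mount options the kernel currently reports. The following is only a minimal sketch: it assumes the usual `/proc/mounts` layout (device, mount point, filesystem type, options), and the function name and the sample entries, including the pairing of `/dev/sda10` with `/var`, are made up for illustration.

```python
def read_only_mounts(mounts_text):
    """Return (device, mount_point, fstype) for each entry mounted read-only."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fstype, options = fields[:4]
        # Compare whole options, so "errors=remount-ro" is not mistaken for "ro".
        if "ro" in options.split(","):
            found.append((device, mount_point, fstype))
    return found


# Hypothetical sample in /proc/mounts format; on a real system one would
# pass in the contents of /proc/mounts instead.
SAMPLE = (
    "/dev/sda1 / ext3 rw,errors=remount-ro 0 0\n"
    "/dev/sda10 /var ext3 ro 0 0\n"
)

for device, mount_point, fstype in read_only_mounts(SAMPLE):
    print(f"{device} on {mount_point} ({fstype}) is mounted read-only")
```

Run from cron against the real `/proc/mounts`, something like this would at least flag the remount when it happens rather than hours later.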

 

[moved from Software by spinynorman]

It sounds like your memory (the machine's, that is) is getting flaky.

I think you need to run memtest for a day or so to be sure.

When faulty memory starts to corrupt files, it seems to produce read-only conditions, but I do not know whether that is the planned design for such situations or not.

Check the memory.

Cheers. John.

Yep, sure enough, there are 2 stuck bits in the RAM, both down low, at about 31 MB and 81 MB. I have the badram= line that memtest generated, but no idea how to use it (and it seems to repeat itself, so I'd like to sanity-check it first). Is it supposed to be added to the kernel command line in LILO? Anyway, still looking for docs on BadRAM...
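For sanity-checking a line like that, a small script can split it into pairs, drop the repeats, and show roughly where each address sits. This is a sketch under assumptions: it takes the line to be a comma-separated list of hexadecimal address,mask pairs, and it assumes an address is covered when it agrees with the pattern address on every bit set in the mask. The example line is invented to resemble errors near 31 MB and 81 MB; it is not the line memtest actually produced here.

```python
def parse_badram(arg):
    """Split 'badram=addr,mask,addr,mask,...' into unique (addr, mask) pairs."""
    value = arg.split("=", 1)[-1]
    numbers = [int(tok, 16) for tok in value.split(",") if tok.strip()]
    if len(numbers) % 2:
        raise ValueError("expected an even number of values (address,mask pairs)")
    pairs = []
    for pair in zip(numbers[0::2], numbers[1::2]):
        if pair not in pairs:  # drop repeated pairs, keep first-seen order
            pairs.append(pair)
    return pairs


def is_masked(address, pairs):
    """Assumed matching rule: address equals pattern on all bits set in mask."""
    return any((address & mask) == (addr & mask) for addr, mask in pairs)


# Hypothetical line, NOT the one from this thread.
LINE = "badram=0x01f12340,0xfffffffc,0x05123abc,0xfffffffc,0x01f12340,0xfffffffc"

for addr, mask in parse_badram(LINE):
    print(f"address 0x{addr:08x} (about {addr / 2**20:.0f} MB), mask 0x{mask:08x}")
```

If the deduplicated pairs land near the addresses memtest reported as failing, the line is at least self-consistent.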

Further developments. I borrowed a spare memory module from work to see whether it was my RAM or my motherboard causing the memory errors. Well, the surprise was that the one from work has an error in an entirely different place, at about 124 MB.

 

So I put mine back in and slowed down the memory timings in the BIOS (it's an Abit NF7-S board and fairly configurable). One of the 2 errors went away. I have put a badram entry on the kernel command line to stop the remaining error from causing problems. I may go back and restore the proper memory timings, and mask out the second error as well.

 

For now, we'll see what happens. I think I'll get a few memory sticks at work and see how many have errors. Maybe the cheap RAM being sold around town is not so good!!!

Here's what happened (including my remount). Any ideas about what is going on would be appreciated.

 

EXT3-fs error (device sda10): ext3_free_blocks: Freeing blocks not in datazone - block = 53796864, count = 1

Aborting journal on device sda10.

EXT3-fs error (device sda10) in ext3_free_blocks_sb: Journal has aborted

EXT3-fs error (device sda10) in ext3_reserve_inode_write: Journal has aborted

EXT3-fs error (device sda10) in ext3_truncate: Journal has aborted

EXT3-fs error (device sda10) in ext3_reserve_inode_write: Journal has aborted

EXT3-fs error (device sda10) in ext3_orphan_del: Journal has aborted

EXT3-fs error (device sda10) in ext3_reserve_inode_write: Journal has aborted

EXT3-fs error (device sda10) in ext3_delete_inode: Journal has aborted

__journal_remove_journal_head: freeing b_committed_data

ext3_abort called.

EXT3-fs error (device sda10): ext3_journal_start_sb: Detected aborted journal

Remounting filesystem read-only

end_request: I/O error, dev fd0, sector 0

end_request: I/O error, dev fd0, sector 0

end_request: I/O error, dev fd0, sector 0

end_request: I/O error, dev fd0, sector 0

kjournald starting. Commit interval 5 seconds

EXT3-fs warning (device sda10): ext3_clear_journal_err: Filesystem error recorded from previous mount: IOfailure

EXT3-fs warning (device sda10): ext3_clear_journal_err: Marking fs in need of filesystem check.

EXT3-fs warning: mounting fs with errors, running e2fsck is recommended

EXT3 FS on sda10, internal journal

EXT3-fs: recovery complete.

EXT3-fs: mounted filesystem with ordered data mode.
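In a log like this, the first EXT3-fs error is usually the interesting one; the "Journal has aborted" lines after it look like follow-on effects of the same abort, and the fd0 lines refer to the floppy device rather than sda10. A rough way to pull out the first error and confirm the read-only remount, assuming the message layout shown above (the function name and the returned dictionary are made up for this sketch):

```python
import re

# Matches both "EXT3-fs error (device X): func: msg"
# and "EXT3-fs error (device X) in func: msg".
ERROR_RE = re.compile(
    r"EXT3-fs error \(device (?P<dev>\w+)\)(?::| in) (?P<func>\w+):\s*(?P<msg>.*)"
)


def summarize(log_text):
    """Report the first EXT3-fs error and whether a read-only remount followed."""
    first_error = None
    error_count = 0
    remounted_ro = False
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            error_count += 1
            if first_error is None:
                first_error = (match["dev"], match["func"], match["msg"])
        elif "Remounting filesystem read-only" in line:
            remounted_ro = True
    return {
        "first_error": first_error,
        "error_count": error_count,
        "remounted_ro": remounted_ro,
    }


# A few lines copied from the log above.
SAMPLE_LOG = """\
EXT3-fs error (device sda10): ext3_free_blocks: Freeing blocks not in datazone - block = 53796864, count = 1
Aborting journal on device sda10.
EXT3-fs error (device sda10) in ext3_free_blocks_sb: Journal has aborted
Remounting filesystem read-only
end_request: I/O error, dev fd0, sector 0
"""

print(summarize(SAMPLE_LOG))
```

Here the first error is the "Freeing blocks not in datazone" message from ext3_free_blocks, which points at inconsistent on-disk metadata rather than a failed read or write on sda10.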

ffrr - there was no need to start a new thread, so I've merged it into this one.  :)

 

Ok, sorry.

 

Right then, to explain the above: although I have excluded the bad RAM, I still ran into a problem later that night. On further thought, though, maybe this part of the disk was damaged while I was still using some bad RAM. Does this indicate I should fsck all my partitions for a clean start now?

 

If so, what's the best way: boot off the install DVD, go to the repair prompt, and check all the partitions before they are mounted? fsck -f forces a check on supposedly clean filesystems, doesn't it? Given the number of keypresses I suffered while cleaning that one last night, I think fsck -f -a might be needed...
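For reference, on ext2/ext3 the checker behind fsck is e2fsck, where `-f` forces a check even if the filesystem is marked clean, `-p` ("preen") repairs whatever can be fixed safely without asking, `-y` answers yes to every question, and `-n` only reports. `-a` is kept as a backward-compatible spelling of `-p`. A small sketch that only builds and prints the commands, so they can be reviewed before anything is run from the rescue prompt; the helper name is made up, and `/dev/sda10` is the only device named in this thread, so the rest of the list has to be filled in by hand:

```python
E2FSCK_MODES = {
    "preen": "-p",       # fix what is safe to fix, stop on anything serious
    "yes": "-y",         # answer "yes" to every question
    "check-only": "-n",  # open read-only, change nothing
}


def e2fsck_command(device, force=True, mode="preen"):
    """Build (but do not run) an e2fsck command line for one unmounted device."""
    cmd = ["e2fsck"]
    if force:
        cmd.append("-f")  # check even if the filesystem is marked clean
    cmd.append(E2FSCK_MODES[mode])
    cmd.append(device)
    return cmd


# Only sda10 appears in the thread; add the other partitions here.
PARTITIONS = ["/dev/sda10"]

for device in PARTITIONS:
    print(" ".join(e2fsck_command(device)))
```

With `-p`, e2fsck stops and asks for a manual run if it finds something it will not fix automatically, which is usually the point at which `-y` saves the keypresses.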

  • 3 weeks later...

Hopefully a final follow-up on this. Even with new memory that tests OK, I was still getting filesystem corruption, and I think I have finally tracked it down.

 

My motherboard is an Abit NF7-S, and there is apparently a 'known problem' with the SiI 3112 SATA RAID chip on it, especially when working with early Seagate drives. A new BIOS added a parameter called 'Ext P2P' which can be set to various delays, most on the order of 20 or 30 µs, but one setting, recommended if disk problems still occur, is a whopping 1 ms. I have set it to this and it seems to be OK now, touch wood. Strangely, I have noticed no performance hit, and someone else said the difference was only 5% or so.

 

It'll be nice to have a stable system for a while... but I'm not happy with Abit. :angry:
