
hd might be going bad?


jlc

OK, I'm working along and everything is fine, then xmms started skipping on me. Then a game started going crazy on me. So I jumped over to a console with Ctrl+Alt+F1 and saw hdb errors all over the place. Here is a snippet:

 

Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }

 

So I went back to X, and pretty soon, when I typed a command in a terminal, I got "bus error" and that was it: no commands worked. So I rebooted. Mind you, that was the finger reboot, because no commands worked.

 

The box was coming up and I got "hit Y to check file system". Well, that didn't work; it said I needed to do it manually. So I ran fsck on my slices, / came up with some inode errors, and YES, I let it fix them. Anyway, here I am typing from the box again. What do you think: should I do some more testing on my HD or move along in my happy world? Anyone know more about a "SMART" kind of monitor for HDs?

 

Here is some more of /var/log/messages if you're bored with your life and want to read ;)

 

 

Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: ide0: reset: success
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: end_request: I/O error, dev 03:42 (hdb), sector 139007648
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: ide0: reset: success
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: ide0: reset: success
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: end_request: I/O error, dev 03:42 (hdb), sector 3377768
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
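A log like this is easier to judge once the repeats are collapsed. Here is a rough Python sketch that tallies the per-drive kernel messages and pulls out the sectors named in the `end_request: I/O error` lines. The function name `summarize` and the regular expressions are illustrative assumptions based only on the line formats shown above; other kernels may word these messages differently.

```python
import re
from collections import Counter

# Patterns are guesses based on the excerpt above, not a kernel log spec.
IO_ERROR = re.compile(r"end_request: I/O error, dev \S+ \((\w+)\), sector (\d+)")
DRIVE_MSG = re.compile(r"kernel: (hd[a-z]): (.+)$")


def summarize(lines):
    """Return (message counts per drive, list of (drive, sector) I/O errors)."""
    counts = Counter()
    bad_sectors = []
    for line in lines:
        line = line.rstrip()
        m = IO_ERROR.search(line)
        if m:
            bad_sectors.append((m.group(1), int(m.group(2))))
            continue
        m = DRIVE_MSG.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts, bad_sectors


# A few lines copied from the excerpt above.
SAMPLE = """\
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: ide0: reset: success
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: end_request: I/O error, dev 03:42 (hdb), sector 139007648
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: end_request: I/O error, dev 03:42 (hdb), sector 3377768
"""

if __name__ == "__main__":
    counts, bad_sectors = summarize(SAMPLE.splitlines())
    for (drive, message), n in counts.most_common():
        print(f"{drive}: {n} x {message}")
    for drive, sector in bad_sectors:
        print(f"{drive}: I/O error at sector {sector}")
```

To run it against the real log, something like `summarize(open("/var/log/messages"))` should work, assuming the file is readable by your user.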

 

 

:juggle:


$ sudo /usr/sbin/smartctl -i /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     IC35L080AVVA07-0
Serial Number:    VNC400A4L1AAMA
Firmware Version: VA4OA52A
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   5
ATA Standard is:  ATA/ATAPI-5 T13 1321D revision 1
Local Time is:    Thu Feb 12 22:21:18 2004 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


$ sudo /usr/sbin/smartctl -Hc /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity was
                                       never started.
                                       Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                       without error or no self-test has ever
                                       been run.
Total time to complete Offline
data collection:                 (2288) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                       Auto Offline data collection on/off support.
                                       Suspend Offline collection upon new
                                       command.
                                       Offline surface scan supported.
                                       Self-test supported.
                                       No Conveyance Self-test supported.
                                       No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                       power-saving mode.
                                       Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                       No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  38) minutes.

 

 

Well, I guess the "PASSED" means it is ok?
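Worth noting: the overall-health line only says the drive's own check did not find an attribute past its failure threshold. A drive can still throw I/O errors while reporting PASSED, so the kernel log and the self-tests matter too. If you want to check that line from a script, a minimal sketch might look like this. The helper name `overall_health` is made up for illustration, and the pattern assumes the wording printed by smartctl 5.21 above.

```python
import re

# Matches the verdict line as printed in the output above (assumed format).
HEALTH = re.compile(r"SMART overall-health self-assessment test result:\s*(\S+)")


def overall_health(smartctl_output):
    """Return the verdict string (e.g. 'PASSED'), or None if the line is missing."""
    m = HEALTH.search(smartctl_output)
    return m.group(1) if m else None


# Lines copied from the smartctl -Hc output above.
SAMPLE = """\
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
"""

if __name__ == "__main__":
    verdict = overall_health(SAMPLE)
    if verdict == "PASSED":
        print("Drive reports PASSED; still check the kernel log and self-tests.")
    else:
        print(f"Drive verdict: {verdict}")
```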


$ sudo /usr/sbin/smartctl -l error /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

 

Well, before I started this, I noticed this drive wasn't in the smartctl database, so this probably doesn't matter.


Don't mind me, this is a nice safe "notepad" for me :lol:

 

[justin@neo log]$ sudo /usr/sbin/smartctl -l selftest /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
No self-tests have been logged.  [Use the smartctl -t option to run these.]


[justin@neo log]$ sudo /usr/sbin/smartctl -t short /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Thu Feb 12 22:28:07 2004

Use smartctl -X to abort test.


[justin@neo log]$ sudo /usr/sbin/smartctl -l selftest /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     13779         -

 

Looks good so far.
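For anyone who wants to keep an eye on the self-test log from a script, here is a small Python sketch that parses entry rows like the one above. The column layout is assumed from this one smartctl 5.21 output, and `parse_selftest_row` is a made-up helper name, so treat it as a starting point only.

```python
import re

# Columns: Num, Test_Description, Status, Remaining, LifeTime(hours), LBA_of_first_error.
# Description and status are assumed to be separated by two or more spaces.
ROW = re.compile(r"#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)")


def parse_selftest_row(line):
    """Return a dict for one self-test log row, or None if the line is not a row."""
    m = ROW.match(line.strip())
    if not m:
        return None
    num, description, status, remaining, hours, lba = m.groups()
    return {
        "num": int(num),
        "description": description,
        "status": status,
        "remaining_percent": int(remaining),
        "lifetime_hours": int(hours),
        # "-" in the last column means no failing LBA was recorded.
        "lba_of_first_error": None if lba == "-" else lba,
    }


# The row from the output above.
SAMPLE_ROW = "# 1  Short offline       Completed without error       00%     13779         -"

if __name__ == "__main__":
    entry = parse_selftest_row(SAMPLE_ROW)
    if entry is not None and entry["lba_of_first_error"] is None:
        print(f"Test {entry['num']} ({entry['description']}): {entry['status']}")
```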


[justin@neo log]$ sudo /usr/sbin/smartctl -t long /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 38 minutes for test to complete.
Test will complete after Thu Feb 12 23:07:40 2004

Use smartctl -X to abort test.

 

38 minutes, this is going to be a long wait. Wonder if I can still play "et" while the test is running. :unsure:


cybrjackle.....

Have you rebooted this year????

 

Yeah, I know you shouldn't have to and all......

 

but if it's been up and running for 6 months then a power down for an hour might let things cool down and stuff!

 

It could also be the drive controller - probably built into the mobo

... or lots of things.....

 

But ....

I've noticed some of my long-uptime boxes develop random errors, and rereading the BIOS and booting seems to make them happy again....


Well, when I ran the long test, it bombed out. Reboots do nothing but mess up my inodes. This isn't the box that stays up a long time; it gets rebooted often with different kernels.

 

So far it's not looking good, doc. 10-15 minutes after I fsck / on every reboot, the drive goes crazy.

 

I think she's a goner. I think it's a Maxtor, and I'll look for the diag disk I ran on the last one.

 

:wall:


cybrjackle .....

You know this, I'm just reminding you.

(Ix once told me something I already knew about reinstalls saving your social life.... he was very right!!! I just needed telling it again)

 

Stop messing.... go and get another disk before you lose your data.

Get the other disk in, make a nice install and copy your data and stuff across before it dies on ya!!!!
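When copying off a disk that is throwing I/O errors, it helps if one unreadable file does not abort the whole run. Here is a rough Python sketch of that idea: walk the tree, copy what can be read, and keep a list of what failed. The function name `rescue_copy` and the paths in the usage note are hypothetical, and a dedicated imaging or rescue tool may be the better choice for a badly damaged drive.

```python
import os
import shutil


def rescue_copy(src_root, dst_root):
    """Copy src_root into dst_root, skipping files that raise OSError.

    Returns (copied, failed): copied is a list of source paths, failed is a
    list of (source path, error message) pairs.
    """
    copied, failed = [], []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.join(dst_root, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                # copy2 also tries to preserve timestamps and permissions.
                shutil.copy2(src, dst)
                copied.append(src)
            except OSError as exc:
                # I/O errors from a failing disk surface as OSError.
                failed.append((src, str(exc)))
    return copied, failed
```

Usage would be along the lines of `copied, failed = rescue_copy("/home/justin", "/mnt/newdisk/home/justin")`, then going through `failed` afterwards to see what did not make it.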

 

Then you can mess with the other one.... if you fix it great .... you'll have a spare disk and some room to play with filesystems and stuff.

 

don't take this the wrong way!!! I just think sometimes it helps when someone else says it!!!!

 

Good luck!!

