jlc Posted February 13, 2004

OK, I'm working along and everything is fine, then xmms started skipping on me. Then a game started going crazy on me. So I jumped over to "ctrl + alt + f1" and saw hdb errors all over the place. Here is a snippet:

Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }

So I go back to X, and pretty soon when I type a command in a terminal, I get "bus error". That is it, no commands work. So I reboot; mind you, that is the finger reboot, 'cause "no commands work". Box is coming up, "hit Y to check file system"; well, that didn't work, it said I need to do it manually. So I fsck my slices and / comes up with some inode errors and YES, fix it.

Anyway, so here I am typing from the box again. What do you think: should I do some more testing on my hd or move along in my happy world? Anyone know more about a "smart" kinda monitor for hd's?

Here is some more of /var/log/messages if you're bored with your life and want to read ;)

Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: ide0: reset: success
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: end_request: I/O error, dev 03:42 (hdb), sector 139007648
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:17 neo kernel: hdb: drive not ready for command
Feb 12 20:46:17 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:17 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: ide0: reset: success
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: ide0: reset: success
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
Feb 12 20:46:27 neo kernel:
Feb 12 20:46:27 neo kernel: end_request: I/O error, dev 03:42 (hdb), sector 3377768
Feb 12 20:46:27 neo kernel: hdb: drive not ready for command
Feb 12 20:46:27 neo kernel: hdb: status error: status=0x00 { }
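For anyone wanting a quick feel for how bad a log like the one above is, here is a rough sketch of two small shell helpers. The function names count_not_ready and failed_sectors are made up for illustration, and the patterns assume the exact message wording shown in the log above; other kernels may word these messages differently.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any package).
# Both read syslog-style text on stdin; $1 is the device name, e.g. hdb.

# Count the "drive not ready for command" messages for one device.
count_not_ready() {
    grep -c "kernel: $1: drive not ready for command"
}

# Print the sector number from every "end_request: I/O error" line
# that names the given device, one sector per line.
failed_sectors() {
    sed -n "s|.*end_request: I/O error, dev [^ ]* ($1), sector \([0-9][0-9]*\).*|\1|p"
}
```

Typical use would be something like `count_not_ready hdb < /var/log/messages` and `failed_sectors hdb < /var/log/messages`. Sectors that are far apart on the disk hint that the problem is not confined to one bad spot, though that is only a hint and not a diagnosis.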
jlc Posted February 13, 2004

$ sudo /usr/sbin/smartctl -i /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: IC35L080AVVA07-0
Serial Number: VNC400A4L1AAMA
Firmware Version: VA4OA52A
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 5
ATA Standard is: ATA/ATAPI-5 T13 1321D revision 1
Local Time is: Thu Feb 12 22:21:18 2004 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
jlc Posted February 13, 2004

$ sudo /usr/sbin/smartctl -Hc /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (2288) seconds.
Offline data collection capabilities: (0x1b) SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    No Conveyance Self-test supported.
    No Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
    No General Purpose Logging support.
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 38) minutes.

Well, I guess the "PASSED" means it is OK?
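On the "PASSED" question: that line only reports the drive's own check of its attributes against the vendor failure thresholds, so it is worth reading next to the kernel log instead of on its own. Below is a minimal sketch for pulling the verdict out of `smartctl -H` output in a script. The function names smart_health and smart_passed are hypothetical, and the pattern assumes the result line is worded exactly as in the output above.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative). Both read the text
# output of "smartctl -H" on stdin.

# Print only the verdict from the overall-health line, e.g. PASSED.
smart_health() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Succeed (exit status 0) only when the verdict is exactly PASSED.
smart_passed() {
    [ "$(smart_health)" = "PASSED" ]
}
```

Something like `sudo /usr/sbin/smartctl -H /dev/hdb | smart_passed || echo "hdb: health check did not pass"` would then work in a cron job. A passing result here does not rule out the I/O errors seen in /var/log/messages.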
jlc Posted February 13, 2004

$ sudo /usr/sbin/smartctl -l error /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

Well, before I started this, I noticed this drive wasn't in the smart file, so this prolly doesn't matter.
jlc Posted February 13, 2004

Don't mind me, this is a nice safe "notepad" for me :lol:

[justin@neo log]$ sudo /usr/sbin/smartctl -l selftest /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
No self-tests have been logged. [Use the smartctl -t option to run these.]

[justin@neo log]$ sudo /usr/sbin/smartctl -t short /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Thu Feb 12 22:28:07 2004

Use smartctl -X to abort test.
jlc Posted February 13, 2004

[justin@neo log]$ sudo /usr/sbin/smartctl -l selftest /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error  00%        13779            -

Looks good so far.
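Once a few tests have been logged, the self-test table can be filtered instead of read by eye. This is a small sketch that assumes log rows start with "#" followed by the test number, as in the output above; the function name failed_selftests is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative). Reads the text output of
# "smartctl -l selftest" on stdin and prints every logged test row whose
# status is anything other than "Completed without error". That includes
# rows for tests that were aborted or are still in progress.
failed_selftests() {
    awk '/^# *[0-9]+/ && !/Completed without error/ { print }'
}
```

Usage would be along the lines of `sudo /usr/sbin/smartctl -l selftest /dev/hdb | failed_selftests`. No output means every logged test completed cleanly, which still says nothing about problems the self-test does not exercise.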
jlc Posted February 13, 2004

[justin@neo log]$ sudo /usr/sbin/smartctl -t long /dev/hdb
smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 38 minutes for test to complete.
Test will complete after Thu Feb 12 23:07:40 2004

Use smartctl -X to abort test.

38 minutes, this is going to be a long wait. Wonder if I can still play "et" while the test is moving on. :unsure:
bvc Posted February 13, 2004

Quote: Don't mind me, this is a nice safe "notepad" for me :lol:

:lol: rofl

...and??? Well??? Don't leave us hangin' :unsure:
pmpatrick Posted February 13, 2004

If you really want to know, download the hard drive manufacturer's diagnostic utilities and run them. All the major manufacturers have them available on their websites. They're usually the best tool for the job.
dave_hallett Posted February 13, 2004

I would add: if SMART says your drive has a problem, there's probably a problem. But just because SMART says no problem, doesn't mean it's necessarily OK. There are problems that SMART doesn't catch, IIRC.
Qchem Posted February 13, 2004

It could also be the drive controller - probably built into the mobo.
Gowator Posted February 13, 2004

cybrjackle..... Have you rebooted this year???? Yeah, I know you shouldn't have to and all...... but if it's been up and running for 6 months then a power down for an hour might let things cool down and stuff!

Quote: It could also be the drive controller - probably built into the mobo

... or lots of things..... But .... I've noticed some of my long uptime boxes develop random errors, and rereading the BIOS and boot seems to make them happy again....
jlc Posted February 13, 2004

Well, when I ran the long test, it bombed out, and reboots do nothing but craze up my inodes. This isn't the box that stays up a long time; it gets rebooted often with different kernels. So far it's not looking good, dr. 10-15 minutes after I fsck / every time I reboot, the drive goes crazy. I think she's a goner. I think it's a maxtor and I'll look for the diag disk I ran on the last one.
revo Posted February 13, 2004

:( bummer
Gowator Posted February 13, 2004

cybrjackle ..... You know this, I'm just reminding you. (Ix once told me something I already knew about reinstalls saving your social life.... he was very right!!! I just needed telling it again.)

Stop messing.... go and get another disk before you lose your data. Get the other disk in, make a nice install and copy your data and stuff across before it dies on ya!!!! Then you can mess with the other one.... if you fix it, great .... you'll have a spare disk and some room to play with filesystems and stuff.

Don't take this the wrong way!!! I just think sometimes it helps when someone else says it!!!! Good luck!!