Ixthusdan Posted January 21, 2006
Recently someone on another board questioned my logic on a computer problem (imagine somebody questioning my logic! :lol:), and it gave me an idea. I have a hypothetical situation. Since it is hypothetical, there is no real solution. But what would be the troubleshooting path? I think it would make a good discussion because, as most of you know, computer fixin' is not a fine science!

So here it is: There is a Linux box with an ATI 7000 card. Sometimes it will just freeze and lock up. The keyboard is unresponsive and nothing can be done at the machine itself. But one can ssh into it and reboot the machine to fix the problem. For this reason, it seems that if the X server crashes, the whole thing is taken out, just like Windows. What might be wrong? Is Linux like Windows? :lol:

Rather than say what I said, I want to hear what you all think. Don't ask for any more details, because these are all that were offered! Speculate city!
jboy Posted January 23, 2006
Some quick thoughts:

First: Per excellent IX advice from a past message: "Preparation is the Key." Make a list of the various steps and options you are going to take to try to identify the problem and solve it. Plan your problem-solving action steps.

Second: Keep a detailed log of each troubleshooting step you try. Your notes may prove handy later on.

Action Steps:
1. Read Tom Berger's FAQ: What do I do when my system stops responding? Also, the article on the Magic SysRq Key.
2. Check logs -> /var/log/messages, Xorg.0.log, etc. Any hints?
3. Run memory checks.
4. Have all upgrades/bug fixes/security fixes etc. been applied?
5. Test the hypothesis that the problem is X-related: demonstrate that the problem does not occur when operating solely in command line mode. Boot to runlevel 3 and operate the machine for a day or so in this mode. Make sure you use the network: use links or lynx for Internet, ssh in and out, rsync, etc. If there are no freeze problems, then the working hypothesis that it's X-related is looking good.
6. If the machine froze in CLI-only mode, open the box and check all data and power connections. Reseat boards, etc.
7. Is it an X problem or a DM problem? Try some different DMs to see if the problem is specific to the DM you were using.
8. If the machine is stable in CLI-only mode and the problem does not seem related to the DM, try a different video driver. Try VESA as a last resort.
9. If a different video driver works without the freeze recurring, either go with that or try playing around with the X config options on the (presumed) ATI driver the owner was using originally.
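Step 2 of the list above (checking the logs) can be partly automated. What follows is only a minimal Python sketch under stated assumptions, not a definitive tool: the function names and the keyword list are hypothetical, and the only convention it relies on is that X.Org marks error lines with (EE) and warning lines with (WW).

```python
from pathlib import Path

# "(EE)" and "(WW)" are the X.Org log prefixes for errors and warnings.
# The remaining keywords are illustrative guesses, not a complete list.
DEFAULT_MARKERS = ("(EE)", "(WW)", "lockup", "segfault")


def find_suspect_lines(lines, markers=DEFAULT_MARKERS):
    """Return (line_number, text) pairs for lines containing any marker."""
    hits = []
    for number, text in enumerate(lines, start=1):
        lowered = text.lower()
        # Case-insensitive substring match against every marker.
        if any(marker.lower() in lowered for marker in markers):
            hits.append((number, text.rstrip("\n")))
    return hits


def scan_log(path, markers=DEFAULT_MARKERS):
    """Scan one log file; return an empty list if it cannot be read."""
    try:
        with Path(path).open(errors="replace") as handle:
            return find_suspect_lines(handle, markers)
    except OSError:
        # Missing file or insufficient permissions: nothing to report.
        return []
```

Calling scan_log("/var/log/messages") or scan_log("/var/log/Xorg.0.log") would list the candidate lines. The directory for Xorg.0.log is an assumption here (the post names only the file), and reading either log may require root.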
paul Posted January 23, 2006
Oddly enough, I've had this happen to me. The problem was that I had papers stacked on my box. A couple of those papers had slipped down the back and into the main power supply fan. The fan had then seized up, and I could get a good 7-8 hours of use before it would freeze up, unless I was playing games or running seti. But I could always ssh into it. New case/power supply... and the problem went away.
Ixthusdan Posted January 23, 2006
Wow. Now I had never thought of the power supply on this one! jboy, thorough answer!
Gowator Posted January 23, 2006
You suspect the graphics card, so why not just go with that? If you can ssh in, then just kill X... if you can, then it's either a HW or SW fault and probably in the graphics card, so I'd swap the graphics card and check. If it's not the graphics card, it might be a power problem. I just changed a PSU in my GF's machine which was not delivering constant power, and this fixed it... prior to that I had LAME dying every time she tried to rip an MP3, and the problem was the CPU drawing too much power when using the MMX part.
Ixthusdan Posted January 23, 2006
Actually, among hardware problems overall, the power supply is the culprit more and more often compared with, say, 5 years ago. Even new machines by the big manufacturers (Dell) are having this problem.
ianw1974 Posted January 23, 2006
My Dad has six Dells in his office, and two PSUs have failed recently, and one more is likely to go. These aren't exactly old machines, about two years old I would say. It was odd as well, because one was reporting battery problems, so we thought the system board battery was going. As soon as the PSU died, we replaced the PSU, and no more errors asking us to replace the system board battery!!
chalex20 Posted January 23, 2006
I had a similar problem, although with an Nvidia card. Occasional hangups here and there, most of the time with the possibility to ssh in. It was actually a memory problem: one of the DIMMs was faulty. Replaced it, and no hangups since. There are, of course, problems here and there (I'm a pretty heavy tweaker with a strong inclination to break things sometimes), but hangups, none. So running memtest86 for a day or two should actually be one of the items on the troubleshooting list.
Gowator Posted January 23, 2006
Quoting Ixthusdan:
Actually, among hardware problems overall, the power supply is the culprit more and more often compared with, say, 5 years ago. Even new machines by the big manufacturers (Dell) are having this problem.

The problems are simply increasing as power requirements increase and as a PSU gets more connections... AT -> ATX -> P4.
AussieJohn Posted January 24, 2006
Following on from Gowator's excellent comment, this is why I always use the largest-rated power supply I can afford. It is not a waste of resources to have a supply greatly in excess of what your computer routinely consumes, because it means that every component in the supply is under-used and therefore not stressed. This stress is mostly from heat. Not only will your power supply components run cooler, but the voltage regulation components are more stable and better able to handle sudden load changes, whether short term or long term, i.e. smoother regulation.

NOTE (before the nitpickers dive in): A bigger power supply does not mean less overall heat is produced (this is determined by the load that the computer draws). What it does mean is that the components in large power supplies are designed to handle higher loads and temperatures than the components in smaller ones. In a large power supply the components are cruising, whereas the components in a smaller PSU are running close to their limit.

Have you ever noticed how many of the power supplies in old computers are still running today? This is because electronic engineers in the past allowed for up to a 50% excess margin, and often 75%, on their design usage. (My computer could get by with a 300 watt PS, but I have a top-quality 450 watt unit in my home-built machine.)

Dell's problem shows that they are cutting costs by cutting corners and using the cheapest, barely adequate power supplies. That is not good engineering, no matter how much they advertise their quality!

jboy: Good post. Follows good technician's thinking.

Great topic theme, Ixthusdan.

Cheers. John.
ianw1974 Posted January 24, 2006
It's a shame, because they tend to fit a PSU that only just matches the hardware in the machine. Then you start adding more and more stuff yourself, and suddenly you can't turn the damn thing on! You need a larger PSU to power what you've just added.

So, an example for you. You have a 300W PSU. Each item in the machine draws power from it, e.g.:

CPU - can be around 80-90W
Memory - about 20W
Video Card - can be around 80-90W
CDROM/HDD - 30W each
Sound Card - 20W to 30W perhaps

So, if we assume one HDD and one CDROM drive, the total for the above system comes to 280/290W. I used 90W for both the CPU and the Video Card. So you add another hard disk, maybe a DVD Writer drive, and you've put yourself over what your PSU can handle.

The problem is, when you turn the machine on, every single component fires up at the same time, so you get a hit on all the wattage at once. If this exceeds the rating of your PSU, you can't turn the machine on.

If you run AMD 64 chips, you should not be running a PSU of less than 430W.
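The arithmetic in the post above can be checked with a short Python sketch. The wattages are the rough figures from the post, not measurements; the 30W for a DVD writer is an assumption (the post gives 30W only for CDROM and HDD), and the names BASE_SYSTEM, UPGRADED, total_draw and fits are hypothetical.

```python
# Estimated draw per component in watts as (low, high) pairs.
# CPU and video card use the 90W upper figure, as the post does.
BASE_SYSTEM = {
    "CPU": (90, 90),
    "Memory": (20, 20),
    "Video card": (90, 90),
    "CDROM": (30, 30),
    "HDD": (30, 30),
    "Sound card": (20, 30),
}

# The upgrade described in the post: one more hard disk and a DVD writer.
# The DVD writer figure is assumed to match the CDROM figure.
UPGRADED = dict(BASE_SYSTEM, **{"Second HDD": (30, 30), "DVD writer": (30, 30)})


def total_draw(components):
    """Return the (low, high) total wattage for a dict of component ranges."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


def fits(components, psu_watts):
    """True if even the high-end estimate stays within the PSU rating."""
    return total_draw(components)[1] <= psu_watts


print(total_draw(BASE_SYSTEM), fits(BASE_SYSTEM, 300))
print(total_draw(UPGRADED), fits(UPGRADED, 300))
```

With these figures the base system totals 280/290W and fits a 300W supply, while the upgraded one totals 340/350W and does not. This is a simple sum of rough estimates and ignores how a supply's rating is split across its voltage rails.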
Ixthusdan Posted January 24, 2006
AussieJohn has a great point about power supplies. They really should be oversized when building a PC. I also believe that the quality of power supplies has dwindled. Ironically, from a cost perspective, it's not all that much more to increase the size of the power supply. Yet margins are more important to the commercial producers than quality, and that is the unfortunate fact.
jboy Posted January 24, 2006
This thread is an eye-opener and makes me think that I've been lucky. I've always built my own boxes, always had 2 HDs and 2 optical drives, always put in extra fans, and ran einstein@home and seti@home (which raised the CPU temperature 5 degrees F, so that's drawing more power), and never had a glitch. However, I have never had a real high-end video card either.

When I bought the cases, I paid attention to the PSU rating but never really had a clear idea of how much wattage I actually needed. I just always assumed I'd be OK with 350-400 watts. I sure will pay more attention to this in the future.

I guess the mechanism here is that with the GUI running, a spike in power requirements overloads the PSU, CPU instructions get garbled, and the GUI crashes, but the kernel itself and the network and other services somehow (by pure random luck?) do not crash. So power requirements then drop, but with the kernel itself still functional, ssh into the box works OK to shut the machine down. Is that what's going on?

So for the machine in question that Ix gave us, can anyone think of a way to test the PSU theory more directly than by process of elimination (memtests, other diagnostics, taxing the GUI with power-hungry CPU- and disk-intensive tasks, swapping out the PSU for a higher-rated one, etc.)?

Edited January 24, 2006 by jboy
ianw1974 Posted January 24, 2006
You'd probably have to rip out the PSU and try it separately, with some sort of electrical tester. I should speak to my mate; he's good at this electronics sort of stuff. I'm sure he's got one of these tester gadgets.
Gowator Posted January 24, 2006
Quoting AussieJohn:
Have you ever noticed how many of the power supplies in old computers are still running today? This is because electronic engineers in the past allowed for up to a 50% excess margin, and often 75%, on their design usage. (My computer could get by with a 300 watt PS, but I have a top-quality 450 watt unit in my home-built machine.) Dell's problem shows that they are cutting costs by cutting corners and using the cheapest, barely adequate power supplies. That is not good engineering, no matter how much they advertise their quality!

This is also indicative of the expected lifetime of a machine. Back in the 80's you damned well expected that machine to last... and people bought machines to last. Today we live in a throwaway society, and with the rate of progress on PCs ever increasing, we largely buy disposable items.

Back in the 80's you could build your own PC for 20% less than a pre-made one, because they were put together with care... now it costs 50% more to make your own. Back then an entry-level PC cost $1000++; now it's $300+. If you pay peanuts, then you will get monkeys doing your computing...

A big difference is that, in contrast to the solid-state components in a PC, the PSU has moving fans and coils. Whereas the cost of a chip, once it's in production, is the price of sand plus energy, the cost of making coils and fans involves labor. Hence, as John says, the PSU has become the easiest part on which to cut corners and costs.