
Thread: Linux-Need badblocks program advice

  1. #1
    Old Timer jasong's Avatar
    Join Date
    Oct 2004
    Location
    Arkansas(US)
    Posts
    1,778

    Linux-Need badblocks program advice

    Okay, I've got a hard drive that might have a bad write head. But if it IS bad, it only goes wrong about once in every 500 million bits written (individual 1s and 0s). So I'd never notice a problem in a text file, but it could easily corrupt data for distributed computing.

    So, unless there's a better program out there, I need help picking a good combination of options for the badblocks program. I read something about hard drive buffers, and I know very little about hard drives, so I'm hoping someone can feed me a command to try. The hard drive is about 120 GB in size (base-10 GB, about 112 GiB), and I think the first pass should trigger a mistake if the problem is truly the write head.

    Lastly, don't worry about ruining the data on the drive. Anything important that was on the drive is long gone. Right now I just need to determine whether I need to ditch the drive. (I mean recycle when I say ditch, I'm a good boy.)
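
    For reference, a destructive write-mode badblocks run of the kind asked about here could be assembled as in the sketch below. It only builds and prints the command rather than running it. The device name /dev/sdX, the 4096-byte block size, and the log file name are placeholders assumed for illustration, not values from this thread. The drive would need to be unmounted and the command run as root.

```python
import shlex

def build_badblocks_cmd(device, block_size=4096, log_file="badblocks.log"):
    """Build (but do not run) a destructive write-mode badblocks command.

    device, block_size and log_file are illustrative assumptions;
    substitute the real device node for the drive under test.
    """
    return [
        "badblocks",
        "-w",                   # write-mode test: overwrites ALL data on the device
        "-s",                   # show progress while scanning
        "-v",                   # verbose output, including error counts
        "-b", str(block_size),  # block size in bytes
        "-o", log_file,         # write the list of bad blocks to this file
        device,
    ]

# /dev/sdX is a placeholder, not a real device name.
cmd = build_badblocks_cmd("/dev/sdX")
print(shlex.join(cmd))
# To actually run it (as root, on an unmounted drive you are willing to wipe):
# import subprocess; subprocess.run(cmd, check=True)
```

    In write mode badblocks writes test patterns across the whole device and reads each one back, so a drive that occasionally miswrites bits should show up as reported errors.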

  2. #2
    Target Butt IronBits's Avatar
    Join Date
    Dec 2001
    Location
    Morrisville, NC
    Posts
    8,619
    Need make and model of HDD.
    Each one is different.
    Manufacturers usually have a utility on their websites you can download and use to test/diagnose and map out the bad sectors.

  3. #3
    Dungeon Master alpha's Avatar
    Join Date
    Mar 2002
    Location
    Norfolk, UK
    Posts
    1,700
    Yeah, the utilities IB is referring to can be hit or miss sometimes, but that's what I'd start with too. I've had some success with them in the past. Most of them are burnable ISOs which you then boot from a CD, thus removing any OS-interoperability issues.

    How did you go about diagnosing the problem and deciding there is one error every 500 million writes?

  4. #4
    Old Timer jasong's Avatar
    Join Date
    Oct 2004
    Location
    Arkansas(US)
    Posts
    1,778
    Quote Originally Posted by alpha View Post
    How did you go about diagnosing the problem and deciding there is one error every 500 million writes?
    Well, my conclusion isn't totally scientific, but basically...

    I've been having trouble with my quad-core for a while. I like to have the same Linux distro on my laptop as on my quad-core, but it's only the quad-core that consistently suffers from problems. I tried to run P-1 on it a few weeks ago, and it got some fairly bad errors, so I assumed it was a CPU or RAM problem. The thing is that even though I continuously have errors with installs and distributed computing stuff, the Prime95 torture test and a bootable ISO RAM test revealed absolutely nothing. After I thought about it, I realized that the errors disappeared when the hard drive activity was only (or mostly) reads. The 1 in 500 million is a SWAG, based on the fact that some installs boot and some don't. Suffice it to say that you might not notice the errors if you were just using the computer for word processing.
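
    Taking the 1-in-500-million figure at face value, a quick back-of-the-envelope check shows why a single full write pass should be enough to expose the fault. This is only a sketch: it assumes errors are spread evenly and independently over every bit written, which is a simplification, and the error rate itself is the guess above, not a measurement.

```python
DRIVE_BYTES = 120 * 10**9      # "about 120GB", base-10, from the first post
BITS_PER_ERROR = 500_000_000   # the poster's guess (SWAG), not a measured rate

def size_in_gib(n_bytes):
    """Convert a byte count to binary gibibytes (2**30 bytes each)."""
    return n_bytes / 2**30

def expected_bit_errors(n_bytes, bits_per_error):
    """Expected number of bad bits when every bit of n_bytes is written once."""
    return n_bytes * 8 / bits_per_error

print(round(size_in_gib(DRIVE_BYTES), 1))                # 111.8
print(expected_bit_errors(DRIVE_BYTES, BITS_PER_ERROR))  # 1920.0
```

    So if the guess is even roughly right, one pass over the whole drive would be expected to hit on the order of a couple of thousand bad bits, not just one or two.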

  5. #5
    Dungeon Master alpha's Avatar
    Join Date
    Mar 2002
    Location
    Norfolk, UK
    Posts
    1,700
    How long are you running the Prime95 test? How long are you running the RAM test? These need to be run for 12+ hours really. What are your temps like?

    Have you run the hard disk manufacturer's diagnostic utility yet? Were there any errors? How is your disk's SMART status?

    When you're referring to the "errors" and "problems" that give you reason to think something is wrong, what are they, specifically? If you can give actual error messages it always helps to track the problem down.

    Need lots more info to narrow this down.

  6. #6
    Old Timer jasong's Avatar
    Join Date
    Oct 2004
    Location
    Arkansas(US)
    Posts
    1,778
    Quote Originally Posted by alpha View Post
    How long are you running the Prime95 test? How long are you running the RAM test? These need to be run for 12+ hours really. What are your temps like?

    Have you run the hard disk manufacturer's diagnostic utility yet? Were there any errors? How is your disk's SMART status?

    When you're referring to the "errors" and "problems" that give you reason to think something is wrong, what are they, specifically? If you can give actual error messages it always helps to track the problem down.

    Need lots more info to narrow this down.
    I ran the RAM test for 88.5 hours without an error, and I ran Prime95 (blend test) for a couple of days with no errors.

    I really wanted to get my stuff working again, so I'm running BOINC on a bootable Linux ISO. Because (1) I have an anxiety disorder, (2) my father's father is in really bad health, meaning my dad is stressed, and (3) me and my dad don't get along that well anyway, I've decided it would be best to just ignore the problem and run BOINC on projects that can tolerate errors, assuming it's the CPU. But I still think it's the hard drive.

    I might decide to re-google the badblocks and hard drive stuff to try to solve the problem, but I can't afford to get fixated on it, because that's when I start yelling at the computer. And I don't want to subject my father to that.

    Thanks for the help, though.
