
Thread: Check Your Boxen FDC

  1. #1

    Check Your Boxen FDC

    Check your boxen for WUs from the 4.97 version. Some of these have an almost 100% failure rate.


    Check out XS's full topic here:

    http://www.xtremesystems.org/forums/...ad.php?t=95458

    And the R@H coverage:

    http://boinc.bakerlab.org/rosetta/fo...ad.php?id=1106



    cheers.
    XS Bullet2urbrain

  2. #2
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,506
    Blog Entries
    13
    Yeah, I noticed it too. What a PIA. I was just letting a number of the machines finish up their stash of Malaria Control and QMC before switching back to Rosetta as well. The Linux boxes are OK as they aren't on 4.97. Think I'll switch them all back to QMC until this is resolved.

    Bok

  3. #3
    Administrator PCZ's Avatar
    Join Date
    Jun 2003
    Location
    Chertsey Surrey UK
    Posts
    2,428
    Well, I have just wasted a day trying to get my new dualies running Rosetta WUs without errors, only to discover that the WUs are BAD.

  4. #4
    Senior Member Chuck's Avatar
    Join Date
    Aug 2005
    Posts
    406
    Blog Entries
    2

    getting bad WUs as well

    I am getting the bad WUs as well... TONS of them!!!

    I have had them run anywhere from 30-40 seconds up to 60 minutes
    before failure.

    All boxen, regardless of OS, are failing, singles and dualies alike.

    C.




    A FDC in training, fellow supporter of Firefox.

    Proudly crunching with AMD & ATI power.
    If you want The Best you must forget the Rest
    >>>>>>>>>and join Free-DC<<<<<<<<<<<

  5. #5
    I'm running the v4.97 Ralph WUs without any errors so far. I wonder what the difference could be between the Ralph and Rosetta WUs in the same version. Luckily all my Rosetta WUs are still v4.83...

    One thing I notice is that the Ralph WUs I'm running are BARCODE_30s, and most of the complaints in the Rosetta forum are about the HBLR WUs. I have some HBLR Ralph WUs too but won't get to them for a few days yet, so I'll have to keep an eye on them to see if they are a problem or not...
    Last edited by Paladin*; 04-08-2006 at 04:57 PM.

  6. #6
    Target Butt IronBits's Avatar
    Join Date
    Dec 2001
    Location
    Morrisville, NC
    Posts
    8,619

    Quad - Dual Opteron 265 goodness

    Got it up and running finally, what a PITA!!!
    I'll create another post in the Hardware Forum when it's time...

    Anyway, running DNET OGR to burn it in at stock speeds until the Rosetta problems go away.

    Saw this in another thread elsewhere - not good !!!
    Originally Posted by David Baker
    I'm really sorry about these problems. I checked yesterday on RALPH and everything seemed fine, but there clearly is a problem. Unfortunately, I'm just leaving for a family weekend trip so can't figure things out right away. Please bear with us for a couple of days.

  7. #7
    Target Butt IronBits's Avatar
    Quote Originally Posted by PCZ
    Well, I have just wasted a day trying to get my new dualies running Rosetta WUs without errors, only to discover that the WUs are BAD.
    Slap 'em on DPAD or something then,
    whilst we all wait a couple of days for Rosetta to get their shite together.

  8. #8
    Administrator PCZ's Avatar
    Actually, IB, I turned 'em off to save electricity.

  9. #9
    =>Team Joker<= LAURENU2's Avatar
    Join Date
    Dec 2004
    Location
    Chicago IL USA
    Posts
    5,478
    Blog Entries
    1
    I just posted over at Rosetta and got this back:

    Moderator9
    Forum moderator
    Joined: Jan 22, 2006
    Posts: 454
    ID: 53254
    Credit: 0
    RAC: 0
    Message 13288 - Posted 8 Apr 2006 23:11:41 UTC

    I just got this message from David Kim who is currently addressing this problem.

    "I just reverted back to the previous app. You should notice a version
    4.98 now, which is really version 4.83 for windows and mac, and 4.82
    for linux."

    You all should see some relief very soon. If you force an update it should load the new version once the server is set up.
    ____________
    Moderator9
    ROSETTA@home FAQ

  10. #10
    Senior Member Chuck's Avatar

    Rosetta status 2300 CDT

    As of this moment... 2300 CDT 8-Apr-06...

    Regrettably, I did a full reset of all machines, hence dumping the WUs.
    I had too many fail within 20 minutes of completion.

    Since the reset, the new exe downloaded as promised and the WUs are starting to run. I will advise if I see anything other than complete success.


    C.





  11. #11
    Target Butt IronBits's Avatar
    Watching the new Quad fight for bandwidth to download WUs, and having the DIMES clients running isn't helping, I'm sure...
    But I got enough work to start running it.

  12. #12
    Senior Member Chuck's Avatar
    I only have about 8 hrs of work per CPU at this point also... I'm sure we're going to drain their WU generator and, as you noted, saturate their bandwidth.

    I did shorten my queue to 24 hours' worth of work per machine. It seems to be helping.

    IB, how is your quad behaving? Its natural tendency to swizzle isn't starving anything, is it? Let me know if there is anything I can do to help tune.

    C.





  13. #13
    Christmas Lighterer!
    Join Date
    Mar 2004
    Location
    Upstate NY, USA
    Posts
    556
    Blog Entries
    1
    Quote Originally Posted by IronBits
    Watching the new Quad...
    Yummmm...

  14. #14
    Target Butt IronBits's Avatar
    Only had one bad WU error out on just one boxen so far
    Just bumped the Quad to 2GHz and still kicking, no voltage bumps required yet

  15. #15
    =>Team Joker<= LAURENU2's Avatar
    All seems better now; all the red lines are fading over the hills.

  16. #16
    Target Butt IronBits's Avatar
    Survived over here!
    2.356 GHz running DNET OGR on the Quad, until I'm sure it's stable, then back to Rosetta.
    30% OC
    CPU1 reads 117F/48C and CPU2 reads 135F/57C (after two applications of two different thermal greases)
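
    As a rough sanity check on those numbers, here is a small Python sketch of the arithmetic. The 1.8 GHz stock clock is an assumption about the Opteron 265 (it is not stated in the thread), and the helper names are made up for illustration.

```python
# Assumption: 1.8 GHz is the stock clock of an Opteron 265; the thread does not state it.
STOCK_GHZ = 1.8


def overclock_percent(actual_ghz, stock_ghz=STOCK_GHZ):
    """Return the overclock as a percentage above the stock clock."""
    return (actual_ghz / stock_ghz - 1.0) * 100.0


def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    print(f"OC at 2.356 GHz: {overclock_percent(2.356):.1f}%")
    for label, deg_f in (("CPU1", 117), ("CPU2", 135)):
        print(f"{label}: {deg_f}F = {fahrenheit_to_celsius(deg_f):.1f}C")
```

    Under that assumption, 2.356 GHz works out to a bit under 31% over stock, which fits the "30% OC" figure. The Celsius values come out within a degree of the posted ones, so the monitoring tool is probably just rounding differently.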

  17. #17
    Senior Member Chuck's Avatar
    Quote Originally Posted by IronBits
    Survived over here!
    2.356 GHz running DNET OGR on the Quad, until I'm sure it's stable, then back to Rosetta.
    30% OC
    CPU1 reads 117F/48C and CPU2 reads 135F/57C (after two applications of two different thermal greases)

    That's a great temp for CPU1, but experience makes me concerned about CPU2. When you switched greases, did you do the usual purging and lapping? What compound are you using? AS-5 w/ aluminum or copper?

    My machines typically run under 35C but aren't OC'd a full 30%.

    Which makes me wonder... do you think I can push mine closer to 30%?

  18. #18
    Target Butt IronBits's Avatar
    Well, it finally rebooted itself, so I've detuned it down to 2.2 for now.
    I noticed the HS is not nearly as smooth as the one that came on CPU1, so it might need some lapping to see if that helps.
    30% is a typical Opteron OC mark to shoot for as a minimum.

  19. #19
    Administrator Bok's Avatar
    My Opty 170 typically runs at 57C, overclocked to 2.6GHz right now. At 2.8GHz, where it was still stable, it was 64C, which is just a bit too hot for my liking...

    Using AS5 and the stock cooler which comes with the Optys (and I believe the X2s above 3800+; at least it's the same one I got with my 4200+)

    Bok

  20. #20
    Senior Member Chuck's Avatar
    I realize this is off-topic, but I have a theory about the .130 vs .090 chips.

    May I ask you gents to send me a PM regarding:

    a: FSB
    b: Core voltage
    c: RAM voltage
    d: RAM timing, assuming 3-3-3-8 (aka std PC-3200)
    e: Other latency timers, etc you tweaked.
    f: Any memory tuning you changed (SPD setting vs what you run now)

    I'm asking because I have a few chips that are unlocked and are 939 pin,
    090 preproduction as well as some 065. The goal is to find the limits of the existing
    chipset(s) as well as new DDR2 and quad-channel / split bus switch fabric.

    I am going to use copper coolers (Gigabyte 3D-Max circular cooler) and
    cool using air cooling. I am going to be testing a quad dualie first.
    If that works, I am going to test linking the boards and running full 8xx series
    CPUs using NUMA link as well as plain Reflective memory.

    I'm going to do both SMP and quasi-ASMP (loose-SMP) testing.

    When done, I will hopefully be able to share the results privately for future
    use, less some proprietary details.


    This ties into what we are doing in that, if successful, it will result in a smaller footprint for all of us.

    Also, if anyone knows which client(s) support defining CPU affinity, I
    would greatly appreciate hearing about it. I know Trux (?) used to support it under 4.x.


    TIA,
    C

    /* edit: Spec goals are: 25% or better OC while maintaining the CPU at 40-45C under full load with standard RAM.
    All suggestions welcome... please remember I have the LDT and NUMA fabric to contend with on an 8GB memory machine */

  21. #21
    Keeper of the Fridge PY 222's Avatar
    Join Date
    Jul 2002
    Location
    San Jose, CA
    Posts
    2,706
    Guys, sorry for being out of the loop but is everything ok on Rosetta and if not, what should I be looking for?

    I am going to bring the clients back up now so any advice would be helpful.

  22. #22
    Administrator Bok's Avatar
    It was a minor glitch with one version (4.97), corrected pretty quickly too. You might want to make sure to do a reset on the clients if they haven't been running for some time, in case they have existing jobs whose deadlines have already passed.

    They will download new jobs and the 4.98 version which is running fine.

    Bok

  23. #23
    Keeper of the Fridge PY 222's Avatar
    Thanks Bok for the heads up.

    How do I reset the client without getting a new ID for the box? If there is no way around this, then I'll just rerun my script and install a new client on the box.

  24. #24
    Administrator Bok's Avatar
    boinc -reset_project <url>

    will do it, keeps the same id, just dumps all the existing work and gets new work.
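
    For anyone scripting this across several boxes, here is a minimal Python sketch that only assembles the command above and prints it for review. The `reset_command` helper and the `ROSETTA_URL` value are assumptions for illustration (the URL is just the prefix of the R@H link earlier in the thread); check the project URL your client actually shows before using it.

```python
import shlex

# Assumption: project URL taken from the prefix of the R@H link earlier in the thread.
ROSETTA_URL = "http://boinc.bakerlab.org/rosetta/"


def reset_command(project_url, client="boinc"):
    """Build the argv list for the reset command quoted in this post."""
    if not project_url.startswith(("http://", "https://")):
        raise ValueError("project URL must start with http:// or https://")
    # Flag spelling follows the post; the client binary name is a hypothetical default.
    return [client, "-reset_project", project_url]


if __name__ == "__main__":
    # Print instead of executing: a reset dumps all queued work on the box.
    print(shlex.join(reset_command(ROSETTA_URL)))
```

    It prints the command rather than running it, since a reset dumps all existing work on that box.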

    Bok

  25. #25
    Senior Member
    Join Date
    May 2002
    Location
    New Jersey USA
    Posts
    115
    The client has been upgraded to 5.01 so keep an eye on your systems.

  26. #26
    Senior Member Chuck's Avatar
    Thank you for the heads up!

    Just what we need... MORE CHANGE !

    <cross fingers> Hope this works </cross fingers>



    C.

  27. #27
    Except this version is supposed to squash some bugs and also help with resets so should be a good improvement. It also adds some new science for hopefully better results.

  28. #28
    =>Team Joker<= LAURENU2's Avatar
    OUCH, getting a lot of 300-hour long-running jobs. I am aborting all of them.

  29. #29
    300 hour? Is that with the new 5.01 client? I thought the latest client automatically aborted after 24 hours in case of trouble like that.

  30. #30
    Well, some of the new ones seem to make progress, but go on and on and on. I had one WU today that ran for just over 24 hours and another that ran just over 18 hours.

    I'd suggest keeping an eye on any FACONTACTS or HBLR WUs you have. It seems that most of the long runners are of those types.

  31. #31
    =>Team Joker<= LAURENU2's Avatar
    Well, it seems all is better now that they canceled the bad WUs after 5.01 and I purged the ones I had.

    I said 300 because that is what I thought it would take to finish the WUs at their rate of increase.
    Full Steam Ahead
