
Thread: Client stuck for over a day now

  1. #1

    Client stuck for over a day now

    On my only dual-processor machine, one of the clients has been stuck on the same generation for over a day. Here is some of the output from dfGUI, which shows the client restarting a lot with no progress being made.

    Code:
    7/25/2003   11:58:50 AM   Protein Size: 96
    Client Status: Appears to be running.
    Current Generation: 44   Generations Buffered: 0
    Structs Complete: 1   Structs Remaining: 49
    Best Energy: 10000000.000   Client Run Time: 0:00:05:00
    Time To Complete:    Bench Run Time: 0:00:05:10
    Prev Generation Time: 0:00:13:30   Avg Generation Time: 0:00:13:11
    Structs/Day: -
    # Restarts: 94
    
    
    7/26/2003   10:38:51 AM   Protein Size: 96
    Client Status: Appears to be running.
    Current Generation: 44   Generations Buffered: 0
    Structs Complete: 1   Structs Remaining: 49
    Best Energy: 10000000.000   Client Run Time: 0:00:02:45
    Time To Complete:    Bench Run Time: 0:00:02:30
    Prev Generation Time: 0:00:13:30   Avg Generation Time: 0:00:13:11
    Structs/Day: -
    # Restarts: 344
    Progress.txt shows -1 generations buffered (???)
    Code:
    Building structure 1 generation 44
    49 until next generation
    -1 generations buffered
    Best Energy so far: 10000000.000
    filelist.txt contains the following:
    Code:
    CurrentStruc 0 1 127 44 1 0 10000000.000 10000000.000 -10000000.000 0.000 0.000 0.950 1.700 330.623 ------------------HHHHHHHHHHH------HHHHH----------------HHHHHH-----------------HHHH-------------
    2d589eac609cf27732ca09af52a5a7e8
    On my other machines, filelist.txt always has some files listed at the beginning.

    The following files are in my directory (no other result files present):

    fwigarf2_0_fwigarf2_protein_40_0000011_min.val
    fwigarf2_0_fwigarf2_protein_43_0000020_min.val.bz2
    fwigarf2_protein_44.trj

    <sigh> The other client on this machine just started over at gen 0 after reaching gen 67. (Maybe because I had its filelist.txt open, so the client couldn't write to it and started over.) Running the client on this machine isn't worth the constant hassle and electricity cost, so I'm considering removing it from the rotation.

    Any ideas about the "stuck" client above are welcome. Thanks in advance.
    Last edited by BuddhaMan; 07-26-2003 at 02:24 PM.
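
For anyone comparing the two dfGUI snapshots above, the restart rate follows directly from the timestamps and the "# Restarts" counts. This is only a minimal sketch, assuming the timestamp format shown in the output; the function name `restarts_per_hour` is made up for illustration and is not part of dfGUI.

```python
from datetime import datetime

def restarts_per_hour(t1, n1, t2, n2):
    """Average restarts per hour between two dfGUI status snapshots.

    t1, t2 are timestamps as printed by dfGUI (assumed format:
    month/day/year hour:minute:second AM/PM); n1, n2 are restart counts.
    """
    fmt = "%m/%d/%Y %I:%M:%S %p"
    start = datetime.strptime(t1, fmt)
    end = datetime.strptime(t2, fmt)
    hours = (end - start).total_seconds() / 3600
    return (n2 - n1) / hours

# Values copied from the two snapshots in the post.
rate = restarts_per_hour("7/25/2003 11:58:50 AM", 94,
                         "7/26/2003 10:38:51 AM", 344)
print(f"{rate:.1f} restarts/hour")
```

That works out to roughly 11 restarts per hour, or one about every five and a half minutes, which lines up with the short "Client Run Time" values (5:00 and 2:45) in the snapshots.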

  2. #2
    Never mind. The machine is being moved to another DC project.

  3. #3
    I am actually having the same problem.

    ------------------------------------------------------------
    Distributed Folding Windows dfGUI v3.1 Benchmark

    Current Generation: 109
    Sample Size : 0 structures over 141 seconds.
    Protein Size: 96AA

    Structures Per Hour : -
    Structures Per Day : -

    OS : Windows XP MHz: 1695
    CPU: Intel(R) Pentium(R) 4 CPU 1700MHz
    Client Switches: -rt -g 1
    ------------------------------------------------------------

    Building structure 1 generation 109
    49 until next generation
    0 generations buffered
    Best Energy so far: 10000000.000


    Basically, while calculating residues, it gets to 20 and just stops. It tries alternate conformations over and over without making any gains. It seems like the fact that no generations are buffered is keeping it from advancing.

    Any ideas about how to fix this problem?
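
To keep an eye on this without staring at the window, the progress text can be read with a short script. The sketch below only parses the four lines shown above. The helper name `parse_progress` is hypothetical, and treating a best energy of 10000000.000 as "nothing scored yet" is an assumption drawn from the pasted output, not documented client behavior.

```python
import re

def parse_progress(text):
    """Pull the numeric fields out of progress.txt-style output."""
    patterns = {
        "build": r"Building structure (\d+) generation (\d+)",
        "remaining": r"(\d+) until next generation",
        "buffered": r"(-?\d+) generations buffered",
        "best_energy": r"Best Energy so far: (-?[\d.]+)",
    }
    found = {key: re.search(rx, text) for key, rx in patterns.items()}
    if not all(found.values()):
        raise ValueError("text does not look like progress output")
    return {
        "structure": int(found["build"].group(1)),
        "generation": int(found["build"].group(2)),
        "remaining": int(found["remaining"].group(1)),
        "buffered": int(found["buffered"].group(1)),
        "best_energy": float(found["best_energy"].group(1)),
    }

sample = """Building structure 1 generation 109
49 until next generation
0 generations buffered
Best Energy so far: 10000000.000
"""

info = parse_progress(sample)
# Assumption: 10000000.000 is a placeholder shown before anything is scored.
no_energy_yet = info["best_energy"] == 10000000.0
print(info["generation"], info["remaining"], no_energy_yet)
```

Running it against successive copies of the file and comparing the generation and remaining counts would show whether the client has moved on at all.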

  4. #4
    This is not a problem; it is intended behavior. You should see the 'laxness' parameters increase slowly, and eventually it will get free. Remember, a watched pot never boils, or so they say.
    Howard Feldman

  5. #5
    You're not running in quiet mode, LagosAzul. Under WinXP we've noticed about 30% of the cpu time spent on processes that disappear when in quiet mode. Thus, you end up with higher performance that way (10-15% more being produced?)
    Even so, I had a machine that took over a day to get past a single generation during the beta, and another of the beta testers spent 2 days. And then it raced through to gen 250 and started over.
    If you have several machines, then you can see the law of averages working a bit better. Sometimes the faster machines are much faster than the slow ones, and sometimes the slower ones run faster than the fast machines. (Run more machines!!!)

  6. #6
    Thanks for the replies. I must have switched out of quiet mode when I was messing with the settings trying to find a fix; I do usually run in quiet mode. Good to know that it was not a bug.

    P.S. I'm trying to get 2 other machines going... I'm not that persuasive, though.
