
Thread: Beta 4 now available

  1. #41
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184
    I have been working on the rare but serious bug that we couldn't track down in beta3. I can only reproduce it sporadically.

    I can sometimes get it by doing the following. I crunch with -rt -if -g1 for a while, then remove the f*.lock file while it is working on structure 1 of a new generation. I can only occasionally stop it at just the right time. I get the impression that I can't stop it too quickly after it starts structure 1, but that I must remove the f*.lock file before I see it working on structure 2.

    I will then have a fold_*.bz2 file with '1' as the middle number (although having such a file does not ensure that I got it right).

    I then crunch with -rt -if -g0 (note the different -g switch) until it finishes the current generation and starts working on the next one.

    If the bug occurred, then the fold_*.bz2 file with a '1' in the middle will now be gone, but it will still be in the filelist.txt file. There is no fold_*.bz2 file for that generation.

    I will try to PM you a URL where you can get the work files. The bug1 directory is after successfully stopping the -g1 crunch at the right place, and bug2 is after the -g0 crunch. Let me know if there are any problems.

  2. #42
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184
    More bugs. I woke up to find one of my nodes not crunching, and the following in the error.log:

    FATAL ERROR: [023.024] {trajtools.c, line 2492} RLEUnPack failed, size=2, should=400*0 - likely this is caused by overclocked or faulty RAM chips, please test your RAM

    Note that none of my nodes are overclocked, and all have passed memtest86.

    The switches were: -rt -if -qt -p0 -g0

    This node was using the +=185 realtime clock acceleration (for a 2 second timeout).
    (During beta3 I had a similar error on a different node, which was also using the +=185 acceleration. As I recall, that previous one occurred during the minimize.)

    Sometime after the bug hit, my automatic upload script woke up. It removes the f*.lock file and waits for the client on the node to exit. In the morning it was still waiting. I suppose the node had put an error message on the screen and was waiting for keyboard input, but it had no monitor or keyboard so I couldn't tell. The easiest thing to do was to just reset the node.
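    A minimal sketch of an upload script step that does not wait forever, assuming the script itself launched the client and so holds its process handle. The f*.lock pattern comes from the post; the function name, the timeout value and the kill fallback are illustrative assumptions, not part of the DF client.

```python
import glob
import os
import time


def stop_client(proc, workdir, timeout_s=600.0, poll_s=1.0):
    """Ask the client to exit by removing its lock file, then wait with a timeout.

    proc is a subprocess.Popen handle for the running client.
    Returns True if the client exited on its own, False if it had to be killed.
    """
    # Removing the f*.lock file is the stop signal described in this thread.
    for lock in glob.glob(os.path.join(workdir, "f*.lock")):
        os.remove(lock)

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if proc.poll() is not None:  # client has exited
            return True
        time.sleep(poll_s)

    # Assumption: a hung client (e.g. one waiting for a keypress) is better
    # killed than waited on indefinitely. A hard kill can still leave the
    # work files in an inconsistent state.
    proc.kill()
    proc.wait()
    return False
```

    A False return would be the cue to log the problem and check the work directory before restarting, rather than finding the script still waiting in the morning.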

    Then I got hit by a second bug. The error.log contained:

    ERROR: [001.001] {trajtools.c, line 3465} Unable to open trajectory distribution file <handle>_protein_239.trj
    FATAL ERROR: [002.003] {foldtrajlite2.c, line 4237} Unable to read trajectory distribution, please create a new one

    Indeed that file wasn't there. (How do you create a new one?) There was a file <handle>_protein_240.trj though.

    Apparently the reset left the work files in an inconsistent state, and the client was unable to recover. I consider that a bug.

    I made a backup of the directory, then I renamed <handle>_protein_240.trj to <handle>_protein_239.trj and the client was happy.

    BTW, the structure being crunched was the 5.18 one on my "AMD beta test account".

  3. #43
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697

    Linux dfGUI port, v2.2beta, for the beta 4 DF client

    Have fun guys (if anyone that uses dfGUI for Linux is even using the beta client, that is... ).

    It's hosted in the same place as before, http://3dguios.resnet.mtu.edu/dfGUI-linux/#newversion

    EDIT: Wait a minute, there's a pretty big bug in the sucker. When a generation finishes, the benchmark data goes completely nuts, because the starting struct number needs to be reset. Hmmm... how to detect that a generation has been finished... Well, I'll see if I can't fix that here.

    EDIT AGAIN: OK, it's fixed and uploaded, but I've been seeing some strange behavior where it appears that the "stop client" button has gotten pressed spontaneously. Hopefully that's just related to the fact that I'm running the non-beta in a different directory on the same machine. Hopefully. If anyone else sees it, let me know.
    Last edited by bwkaz; 03-15-2003 at 08:22 PM.

  4. #44
    Ole has already identified this bug:

    FATAL ERROR: [023.024] {trajtools.c, line 2492} RLEUnPack failed, size=2, should=400*0 - likely this is caused by overclocked or faulty RAM chips, please test your RAM


    which is kind of interesting. It is caused by part of the protein wandering off the edge of conformational space (!). (In this case, conformational space is represented as the surface of a sphere, and it has hit the north pole, which causes some strange problems. I had generally thought that impossible until now, but I guess not.) Anyway, I have fixed this (for the next version) so it cannot happen anymore, but for now it is, unfortunately, unrecoverable (so just delete filelist.txt and restart from gen. 0 if this occurs). I'm not sure if the:

    ERROR: [001.001] {trajtools.c, line 3465} Unable to open trajectory distribution file <handle>_protein_239.trj
    FATAL ERROR: [002.003] {foldtrajlite2.c, line 4237} Unable to read trajectory distribution, please create a new one

    occurred on the same node afterwards? If so, it's probably a direct result of the first error, but if it was on a different node and needs to be looked at, let me know.

    I'll try again to reproduce the tricky bug that you mentioned above - so far I think you're the only one who's seen it, so it must have something to do with the switches you are using. I am still not clear at what point you see the first error message in that case - is it when you use the '-ut' option?
    Howard Feldman

  5. #45
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184
    Originally posted by Brian the Fist ... Anyway, I have fixed this (for the next version) so it cannot happen anymore, but for now it is, unfortunately, unrecoverable (so just delete filelist.txt and restart from gen. 0 if this occurs).
    Actually, I did recover. After the f*.lock file was deleted the client refused to exit, presumably because it had printed an error message and was waiting for a keypress (despite the -qt switch). I didn't have a monitor or keyboard on that node, so I terminated the client by resetting the node (by means of its power cord). I had been using the -g0 switch (to avoid that other bug) so it hadn't been checkpointing, and as a result it "forgot" it had done that generation. The only problem was that the client apparently had already removed the .trj file for that generation and made one for the next generation. Once I renamed the .trj file the client was happy, and all 250 generations have now been successfully crunched and uploaded for that structure.
    I'm not sure if the:

    ERROR: [001.001] {trajtools.c, line 3465} Unable to open trajectory distribution file <handle>_protein_239.trj
    FATAL ERROR: [002.003] {foldtrajlite2.c, line 4237} Unable to read trajectory distribution, please create a new one

    occurred on the same node afterwards??
    Yes, it happened when I tried to start the client after the reset described above.
    I'll try to reproduce the tricky bug again that you mentioned above
    Did you get the work files? With the work files in the bug1 directory the critical step has already been done, so if you start there and just do the second step it should be no problem. You just need to crunch with -rt -if -g0 until you've completed the generation and started the next one. Unlike the first step, it doesn't have to be stopped at any particular time, so you can leave it running and check on it when it's convenient. This step also has a high probability of working. Starting from the files in the bug1 directory, it worked all 4 times I tried.
    so far I think you're the only one who's seen it, so it must have something to do with the switches you are using.
    I think it needs the -if switch. I was hit several times when crunching without the -g switch (I use -g1 and -g0 when trying to reproduce the bug because it makes the bug much more likely to happen.) I used the -rt switch because I always do. I never tried it without that switch. I suspect no one else in the beta has seen this bug because not many testers are doing a lot of crunching with the -if switch and regularly stopping their client with a script that removes the f*.lock file.
    I am still not clear at what point you see the first error message in that case - is it when you use the '-ut' option?
    The client doesn't produce any error message when the bug actually happens. A fold_*.bz2 file for one of the generations quietly disappears from the directory (although it is still in filelist.txt). When the client running with -if is shut down and the results are uploaded by running with -ut, then the error message occurs because the client can't find the missing fold_*.bz2 file.
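    A minimal sketch of a consistency check for this symptom, assuming every line of filelist.txt other than the final CurrentStruc line names a file in the work directory (as in the filelist.txt excerpt posted further down this thread). The function name is hypothetical; it is not part of the DF client.

```python
import os


def missing_listed_files(workdir, filelist_name="filelist.txt"):
    """Return the files named in filelist.txt that are absent from workdir."""
    missing = []
    with open(os.path.join(workdir, filelist_name)) as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines and the trailing status line.
            if not line or line.startswith("CurrentStruc"):
                continue
            # Entries posted in this thread look like ".\fold_..._88.log.bz2";
            # keep only the file name so the check works on any OS.
            name = line.replace("\\", "/").split("/")[-1]
            if not os.path.exists(os.path.join(workdir, name)):
                missing.append(name)
    return missing
```

    Running something like this before every -ut upload would flag the vanished fold_*.bz2 file at the point it disappears instead of at upload time.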

  6. #46
    Originally posted by Brian the Fist
    The last piece of data I will need is to see how everyone else does compared to AMD_is_logical once you've all completed your 60x slower 250 generations.
    I have been running beta4 since you released it but I am still only at generation 40 at the moment.

    At generation 37, structure 24 my laxness values were:
    1.400 2.600 1163.101

    At generation 40, structure 3 my laxness values are:
    1.600 3.000 2034.271

    This is on a P3-800 running -rt -g 1

    Jeff.

  7. #47
    Registered User
    Join Date
    Oct 2002
    Location
    Ottawa, Ontario, Canada
    Posts
    69
    Howard;
    it seems to me that with respect to # of generations, original sample size, # of retries on placing a residue etc., the 'try it and see' approach is necessary. My suggestion is that ALL of these parameters that we may adjust with experience should be set by the server on a dynamic basis. That way, each time a client starts up it can get the latest values.

    You can make the method of setting the values on the server side as simple or complex as you'd like. At the beginning I'd expect you'd simply set them manually and then re-set them periodically to try to determine the optimal values. After a while you could create a method of differentiating the results of clients using different settings and then have sets of clients working with different values simultaneously.

    By adopting this approach you could roll out the beta to the general population without having to first spend all of the time trying to determine the optimal values with very few clients working at it.

    You've probably noticed by now that this is one of my coding philosophies - set everything dynamically using parameters.
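    A minimal sketch of the idea, assuming the server hands the client a small text blob of key=value lines at startup. The parameter names, the default values and the text format are all hypothetical; nothing in this thread says how the real client or server would exchange them.

```python
# Hypothetical defaults used when the server sends nothing usable.
DEFAULTS = {"generations": 250, "sample_size": 10000, "residue_retries": 100}


def merge_server_params(server_text, defaults=DEFAULTS):
    """Overlay 'key=value' lines from the server onto built-in defaults."""
    params = dict(defaults)  # copy, so the defaults themselves stay untouched
    for raw in server_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key not in params:
            continue  # ignore settings this client version does not know
        try:
            params[key] = int(value.strip())
        except ValueError:
            pass  # keep the default if the server value is malformed
    return params
```

    Falling back to the defaults on a bad or missing value means a server-side mistake degrades to the current fixed behaviour rather than stopping the client.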

    ms

  8. #48
    At generation 40, structure 50 my laxness values are:
    1.800 3.400 3557.954

    So they increased quite a bit in that generation.

    Jeff.

  9. #49
    Originally posted by Digital Parasite
    I have been running beta4 since you released it but I am still only at generation 40 at the moment.

    At generation 37, structure 24 my laxness values were:
    1.400 2.600 1163.101

    At generation 40, structure 3 my laxness values are:
    1.600 3.000 2034.271

    This is on a P3-800 running -rt -g 1

    Jeff.
    Actually, it would be useful to me if a few people (who have the time) could post the last line of their filelist.txt (the line starting with 'CurrentStruc') at the point when they are at structure 20-40 (whichever) within a generation, and for several generations after about 50 or up. Make sure it is at structure 20-40 in the generation though; this is important.

    For example, do it for gen 55, struc 25, gen. 56 struc 30, gen. 57 struc 27 (or something like that).

    This is just so I can compare with AMD_is_logical (who already posted this info if you scroll up a few messages). Thanks!
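    A minimal sketch of picking out the requested lines automatically. The field positions are inferred from the CurrentStruc lines posted in this thread (third token looks like a structure counter, fifth the generation, last three the laxness values); they are assumptions, not a documented file format, and the remaining fields are left uninterpreted.

```python
def parse_currentstruc(line):
    """Extract the few CurrentStruc fields this thread lets us identify."""
    parts = line.split()
    if len(parts) < 8 or parts[0] != "CurrentStruc":
        raise ValueError("not a CurrentStruc line: %r" % line)
    return {
        "struc": int(parts[2]),       # assumption: structure counter in the generation
        "generation": int(parts[4]),  # assumption: generation number
        "laxness": tuple(float(x) for x in parts[-3:]),  # assumption: trailing triple
    }


def wanted(rec, min_gen=50, lo=20, hi=40):
    """True if the record falls in the range asked for (gen 50+, structure 20-40)."""
    return rec["generation"] >= min_gen and lo <= rec["struc"] <= hi
```

    The structure counter in the posted lines is sometimes one below the structure number the poster reports, so the 20-40 window is best treated as approximate at its edges.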
    Last edited by Brian the Fist; 03-17-2003 at 01:38 PM.
    Howard Feldman

  10. #50
    My current values at generation 102 are the following.
    CurrentStruc 0 28 123 102 1 4 7.411 -2572.960 -994.298 -1013.360 99461552.000 1.050 1.900 437.252
    (If at some point you would feel that I would be better served by simply resetting to zero again I would be happy to do so.)
    A member of TSF http://teamstirfry.net/

  11. #51
    Senior Member KWSN_Millennium2001Guy's Avatar
    Join Date
    Mar 2002
    Location
    Worked 2 years in Aliso Viejo, CA
    Posts
    205
    At generation 89

    Filelist.txt
    .\fold_0_###_0_###_protein_88.log.bz2
    .\###_0_###_protein_88_0000011.val
    CurrentStruc 0 1 123 89 1 0 10000000.000 10000000.000 -10000000.000 0.000 0.000 0.850 1.500 250.000

    And Progress.txt
    Building structure 1 generation 89
    49 until next generation
    0 generations buffered
    Best Energy so far: 10000000.000

    I think it has taken 2 days to get from gen 80 to gen 89

    I have only one machine running the beta, a 1.8 Ghz P4 w/512 megs DDR RAM running XP pro and nothing other than DF beta.

    running as a service and the useram = 1 flag in service.cfg

  12. #52
    AMD_,

    I got the bug1 and bug2 dirs from you but I'm still very confused by your previous posts. Can you please give me clear, concise, step-by-step instructions for what you want me to do once I unzip the bug1 and bug2 dirs? Specifically, how many times do I need to start and stop the program, and what flags should it use each time? I tried running in the bug1 dir to the end of the generation and it seemed fine...
    Howard Feldman

  13. #53
    Senior Member
    Join Date
    Apr 2002
    Location
    Santa Barbara CA
    Posts
    355
    CurrentStruc 0 21 123 84 1 2 6.721 -556.506 873.475 110.176 4166943.500 1.300 2.400 879.471

    I can see that this could quickly get out of hand, so I am going to wait until I have several generations and put them all in the same post.
    Last edited by Welnic; 03-17-2003 at 02:43 PM.

  14. #54
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184
    Originally posted by Brian the Fist
    AMD_,

    I got the bug1 and bug2 dirs from you but I'm still very confused by your previous posts. Can you please give me clear, concise, step-by-step instructions for what you want me to do once I unzip the bug1 and bug2 dirs? Specifically, how many times do I need to start and stop the program, and what flags should it use each time? I tried running in the bug1 dir to the end of the generation and it seemed fine...
    Start with the work files in the bug1 directory. Notice that there is a file fold_0*1*1.log.bz2 in both the directory and filelist.txt. As far as I can tell, all is as it should be.

    Now crunch with the switches -rt -if -g0

    Notice that it is crunching near the start of gen 1. Let it continue uninterrupted until it is crunching gen 2. (You can let it go as long after that as you want. Just be sure it has started the ASCII graphics of gen 2 before you stop the client.)

    The contents of bug2 are what I had after doing the above. Notice that the fold_0*1*1.log.bz2 file is no longer there. It is still in filelist.txt, but not in the directory. There is no fold_*.bz2 file for gen 1 in the directory.

  15. #55
    structure #25 of gen. 84:
    CurrentStruc 0 24 123 84 1 21 6.695 -807.737 649.858 -65.892 3514353.250 1.050 1.900 437.252

    structure #28 of gen 85:
    CurrentStruc 0 27 123 85 1 9 6.697 -545.162 589.686 -125.918 3512862.750 1.200 2.200 665.006

    structure #21 of gen. 87:
    CurrentStruc 0 20 123 87 1 10 6.533 -727.888 419.502 -73.535 2480975.500 1.700 3.200 2690.320

    structure #23 of gen. 90:
    CurrentStruc 0 22 123 90 1 13 6.419 -1213.603 327.414 -101.658 4267804.500 1.100 2.000 502.840

    structure 26 of gen 93:
    CurrentStruc 0 25 123 93 1 7 6.392 -631.807 1099.677 -36.079 3397420.000 1.200 2.200 665.006

    All from w2k server

    G
    Last edited by Georgina; 03-18-2003 at 05:17 PM.

  16. #56
    Member
    Join Date
    Apr 2002
    Location
    Denmark
    Posts
    45
    CurrentStruc 0 34 123 75 1 31 7.909 -1847.480 -88.338 -558.075 28772618.000 1.400 2.600 1163.101

    Structure 36 in gen. 75

  17. #57
    Please post several generations' CurrentStruc lines in one post. I need to see 3-4 of them from the same user together, and maybe from 3-4 users, and that's it; then you can stop. Thanks.
    Howard Feldman

  18. #58
    Senior Member
    Join Date
    Apr 2002
    Location
    Santa Barbara CA
    Posts
    355
    If you want to generate a file with the info that Howard needs, make a file like this and have cron run it every 5 minutes. It will grab all of the structures in the 20s (and also structure 2) and append them to the file. Afterwards you can edit the file and you're ready. Depending on the machine, you may have to adjust how often it runs.

    #!/bin/sh
    # Append every CurrentStruc line whose structure field starts with 2
    # (structure 2 and 20-29) to output.txt.
    grep 'CurrentStruc 0 2' temp >> output.txt
    grep 'CurrentStruc 1 2' temp >> output.txt

  19. #59
    Senior Member
    Join Date
    Apr 2002
    Location
    Santa Barbara CA
    Posts
    355
    CurrentStruc 0 21 123 84 1 2 6.721 -556.506 873.475 110.176 4166943.500 1.300 2.400 879.471
    CurrentStruc 0 21 123 85 1 17 6.729 -705.449 1129.461 113.307 4769018.000 1.400 2.600 1163.099
    CurrentStruc 0 22 123 86 1 1 7.509 -543.301 -543.301 -10.866 295175.875 1.100 2.000 502.840
    CurrentStruc 0 21 123 87 1 10 6.853 -878.808 1055.511 75.562 5356430.500 1.400 2.600 1163.101
    CurrentStruc 0 22 123 88 1 8 6.849 -1004.899 1278.727 35.834 6161865.000 1.100 2.000 502.840

  20. #60
    XP 1800+
    CurrentStruc 0 26 123 104 1 16 6.438 -913.039 1202.665 -30.714 5906088.500 1.050 1.900 437.252
    CurrentStruc 0 2 123 107 1 1 6.990 286.935 286.935 5.739 82331.695 1.050 1.900 437.252
    CurrentStruc 0 44 123 117 1 5 6.547 -414.151 1366.844 361.686 15450094.000 1.050 1.900 437.252
    XP 1700+
    CurrentStruc 0 1 123 108 1 0 10000000.000 10000000.000 -10000000.000 0.000 0.000 0.900 1.600 287.500
    CurrentStruc 0 40 123 109 1 37 5.571 -1291.189 1077.074 -341.397 18956416.000 1.250 2.300 764.757
    CurrentStruc 0 16 123 112 1 5 5.706 -1223.127 394.204 -120.809 6028414.500 1.400 2.600 1163.099
    Duron 1.3G
    CurrentStruc 0 12 123 85 1 6 7.304 -1376.449 -311.703 -186.313 9276916.000 1.250 2.300 764.757
    CurrentStruc 0 23 123 86 1 17 7.232 -1602.302 -384.244 -456.128 27032274.000 1.100 2.000 502.840
    CurrentStruc 0 1 123 89 1 0 10000000.000 10000000.000 -10000000.000 0.000 0.000 1.400 2.600 1163.100

  21. #61
    Senior Member
    Join Date
    Apr 2002
    Location
    Santa Barbara CA
    Posts
    355
    Same box as the last post, just more of it.

    CurrentStruc 0 21 123 84 1 2 6.721 -556.506 873.475 110.176 4166943.500 1.300 2.400 879.471
    CurrentStruc 0 21 123 85 1 17 6.729 -705.449 1129.461 113.307 4769018.000 1.400 2.600 1163.099
    CurrentStruc 0 22 123 86 1 1 7.509 -543.301 -543.301 -10.866 295175.875 1.100 2.000 502.840
    CurrentStruc 0 21 123 87 1 10 6.853 -878.808 1055.511 75.562 5356430.500 1.400 2.600 1163.101
    CurrentStruc 0 22 123 88 1 8 6.849 -1004.899 1278.727 35.834 6161865.000 1.100 2.000 502.840
    CurrentStruc 0 20 123 89 1 1 6.904 -841.903 252.314 -116.810 3201866.000 1.050 1.900 437.252
    CurrentStruc 0 24 123 90 1 8 6.995 -1010.049 427.602 -156.914 6935936.500 1.100 2.000 502.840
    CurrentStruc 0 20 123 91 1 16 6.819 -533.324 1412.476 49.499 4960647.500 1.150 2.100 578.266
    CurrentStruc 0 20 123 92 1 3 6.945 5.095 1382.624 184.075 7352760.500 1.050 1.900 437.252
    CurrentStruc 0 23 123 93 1 2 6.933 -690.115 1412.229 167.838 8959461.000 1.100 2.000 502.840
    CurrentStruc 0 24 123 94 1 19 6.888 -373.639 1204.535 98.637 4798321.000 1.050 1.900 437.252
    CurrentStruc 0 20 123 95 1 12 6.730 -554.628 1477.573 262.763 13013899.000 1.100 2.000 502.840
    CurrentStruc 0 22 123 96 1 8 6.785 -528.053 1453.238 187.811 12790938.000 1.200 2.200 665.006
    CurrentStruc 0 22 123 97 1 10 6.854 -402.322 1202.075 215.828 10253236.000 1.200 2.200 665.006
    CurrentStruc 0 20 123 98 1 12 6.897 -222.747 1761.928 246.122 13030260.000 1.200 2.200 665.006
    CurrentStruc 0 20 123 99 1 8 6.806 -371.432 1327.447 196.520 8917435.000 1.350 2.500 1011.391

  22. #62
    Bug from AMD_is_logical fixed

    Ok, I have finally found the bug AMD_* had identified way back. This problem is related to the checkpointing feature that was recently added. In short, the checkpointing creates the fold*.log.bz2 file, and the middle number of the file name was one higher than it should have been. This creates all sorts of potentially weird anomalies, including those described by AMD_*. I have made some changes to fix this but may have inadvertently fouled up one of the other filenames. I don't think so though; it looks like everything is right now.

    I will update the download files on the FTP site to incorporate this fix. I'll post a message when it's been updated.

    Thanks to AMD for finding this bug. When I put the new one up, please download it and see if you can still break it at all.
    Howard Feldman

  23. #63
    Howard, are there any other changes you have made in the file you are going to post? ie: should we all download it to make sure everything is running smoothly?

    PS: I just got a new Dual AMD MP-2600+ machine last night so I will try unleashing the new beta on it. That will get me to the end of 250 generations much faster.

    Jeff.

  24. #64
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    Howard - after mentioning what a better job the Beta program is doing than the normal client, a teammate pointed out that the Beta is folding a different protein than the normal client. What's the difference between the two? (is it just a smaller protein that should have been faster to get to generation 250?)

  25. #65
    It seems to me as though the amount of secondary structure (SS) is slowly degrading generation after generation. It is degrading in the sense that alpha helices are being replaced by 3/10 helices, and eventually by turns and coil. This is logical given that each generation is inheriting trajectory graphs that are not filtered by the SS prediction, and thus, the amino acids are more likely to be placed in conformations that are not as conducive to good SS formation. If this trend continues, even for the beta, it might be wise to reapply the predicted SS filter. On the other hand, SS prediction is known to be only about 80% accurate, and strictly keeping to the predicted secondary structure might prevent us from ever getting close to the native structure... Nevertheless, if after 250 generations, the structures have no worthy SS, then Howard will clearly need to take this into consideration.

    just my $0.02

    -=Michel=-
    Last edited by eshell; 03-18-2003 at 04:19 PM.

  26. #66
    The Beta protein is the previous one we worked on on the live server (1CDZ, from 1/28-2/25, best struc was 7.10 A in 10 billion).

    Anyhow, I've updated the beta files now; they are in the same place as always. It is up to you if you want to update; you don't have to. If you do, though, please try to test the -g option thoroughly for me. Try it both with and without -if and try different -g values (especially -g0, -g1, -g2 and the default of -g5). To remind you, this affects not only the progress.txt update frequency but now also the filelist.txt update frequency. In case of a crash/hard kill it should restart from the last 'checkpoint' it made.

    Please note that if you kill it at a particularly bad time, such as when it is writing out or compressing a .log or .val file, you will still be stuck in an unrecoverable state. Thus it is a help, but it doesn't mean you should now start killing the client improperly.

    Watch carefully to see the behaviour when you stop it (either properly or improperly) and see if it starts off EXACTLY where it left off (in terms of structure number). For example with -g3, if you kill it while building structure #8, it should restart at #7 (since the last checkpoint was at the end of struc 6, the nearest multiple of 3). On the other hand, if it is stopped properly (by removing the .lock file or hitting Q) while building structure #8, it should then restart at structure #8.
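    The expected restart point can be written down as a small function, which may help when checking results by hand. This is a sketch of the rule as described in this post, not the client's code; the -g0 branch is an assumption based on the earlier report that -g0 does no checkpointing.

```python
def restart_structure(building, g, clean_stop):
    """Structure number the client is expected to resume at within a generation.

    building   -- structure being built when the client stopped (1-based)
    g          -- value of the -g switch (checkpoint every g structures)
    clean_stop -- True if stopped via the .lock file or Q, False if killed
    """
    if clean_stop:
        return building  # a proper stop resumes the same structure
    if g <= 0:
        # Assumption: -g0 disables checkpointing, so a kill loses
        # everything done so far in this generation.
        return 1
    completed = building - 1  # structures fully finished before the kill
    last_checkpoint = (completed // g) * g
    return last_checkpoint + 1
```

    For the example in the post, restart_structure(8, 3, False) gives 7 and restart_structure(8, 3, True) gives 8.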

    If killed/quit when minimizing or creating a trajectory distribution, it should restart at that step again.

    Hopefully this is clear to everyone, if not just ask.
    Last edited by Brian the Fist; 03-19-2003 at 11:09 AM.
    Howard Feldman

  27. #67
    Just to let you guys know, I have now switched my gen 50 beta client from my old P3-800 to my new MP-2600+ running on 1 CPU for now. Instead of getting an average of 3 hours 30 minutes per generation it is getting about 1 hour 13 minutes and has already done about 10 generations since this morning.

    I will get to 250 much faster now...
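    As a rough worked example of what that means for reaching generation 250, using only the two per-generation averages quoted above and assuming the client crunches continuously from generation 50 at a steady rate:

```python
def minutes(hours, mins):
    """Convert hours and minutes to minutes."""
    return hours * 60 + mins


old = minutes(3, 30)  # P3-800: 3 h 30 min per generation
new = minutes(1, 13)  # MP-2600+ on one CPU: 1 h 13 min per generation

speedup = old / new
remaining = 250 - 50  # generations left when the client was moved

print(f"speed-up: {speedup:.2f}x")
print(f"old box:  {remaining * old / 1440:.1f} days left")
print(f"new box:  {remaining * new / 1440:.1f} days left")
```

    That works out to roughly a 2.9x speed-up: about 10 days for the remaining 200 generations instead of about 29.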

    Jeff.

  28. #68

    Error

    Hi,

    Got the following error:

    ========================[ Mar 20, 2003 3:43 AM ]========================
    ERROR: [000.000] {foldtrajlite2.c, line 3863} Error during upload: STATUS 906 STRUCTURE COMPRESSION ERROR


    I've got some 21 gens backlogged because of this.

    switches: .\foldtrajlite -f protein -n native -g 1 -rt -df

    WinXP Pro (non-service)
    Team Anandtech DF!

  29. #69

    Re: Error

    Originally posted by m0ti
    Hi,

    Got the following error:

    ========================[ Mar 20, 2003 3:43 AM ]========================
    ERROR: [000.000] {foldtrajlite2.c, line 3863} Error during upload: STATUS 906 STRUCTURE COMPRESSION ERROR


    I've got some 21 gens backlogged because of this.

    switches: .\foldtrajlite -f protein -n native -g 1 -rt -df

    WinXP Pro (non-service)
    You should probably clarify if you are running the newest version of the beta 4 software or the older one.

    Edit: I'm also having trouble uploading structures, I suspect that the culprit is server problems for Distributed Folding. I probably should try the regular client and see if I can upload normally.

    Update: The server for the regular client appears to be working properly and I can upload units.
    Last edited by Aegion; 03-19-2003 at 09:28 PM.
    A member of TSF http://teamstirfry.net/

  30. #70
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184

    Re: Error

    Originally posted by m0ti
    Hi,

    Got the following error:

    ========================[ Mar 20, 2003 3:43 AM ]========================
    ERROR: [000.000] {foldtrajlite2.c, line 3863} Error during upload: STATUS 906 STRUCTURE COMPRESSION ERROR


    I've got some 21 gens backlogged because of this.

    switches: .\foldtrajlite -f protein -n native -g 1 -rt -df

    WinXP Pro (non-service)
    I'm crunching with the new beta4a (since last night) and all was fine for a while. About 2 hours ago my upload script ran and all four nodes got the same structure compression error as what m0ti got above. I'm using linux with -rt -if -g0 -p0.

    Trying to upload these results with the old beta4 client gave the same error message (except line 3862).

    And I had a sub 5A structure in progress.

    Should I switch back to the old beta4 client?

    Do I have to clean out the directories, or can this be fixed on the server side?

  31. #71

    Re: Re: Error

    Originally posted by AMD_is_logical
    I'm crunching with the new beta4a (since last night) and all was fine for a while. About 2 hours ago my upload script ran and all four nodes got the same structure compression error as what m0ti got above. I'm using linux with -rt -if -g0 -p0.

    Trying to upload these results with the old beta4 client gave the same error message (except line 3862).

    And I had a sub 5A structure in progress.

    Should I switch back to the old beta4 client?

    Do I have to clean out the directories, or can this be fixed on the server side?
    It's almost certainly the servers themselves and has nothing to do with the client.
    A member of TSF http://teamstirfry.net/

  32. #72
    ========================[ Mar 19, 2003 8:41 PM ]========================
    ERROR: [000.000] {foldtrajlite2.c, line 3862} Error during upload: STATUS 906 STRUCTURE COMPRESSION ERROR
    I also just noticed the same error. This is on the same box that I had previously reported results from filelist.txt

    G

  33. #73
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184

    Re: Re: Re: Error

    Originally posted by Aegion
    It's almost certainly the servers themselves and has nothing to do with the client.
    Ok, I'll disable my upload script and just crunch offline for now.

  34. #74
    I changed some stuff in the beta CGI on the server before I went home today, so I probably broke it; just crunch offline until tomorrow, when I'll fix it. If you will notice, the top 10 structures list now shows the actual top 10 (before, it showed at most 1 per user, which is why AMD_is_logical is now dominating the top 10). It is important that we now see the true top 10 movies and not just the top 1 from each of the top 10 users. Actually, if it's a minor error I might be able to fix it now, I have 10 minutes... if not, tomorrow morning I'll get the server going again.
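    The difference between the two listings can be sketched as follows. This is an illustration only: it assumes results are (user, RMSD) pairs ranked by lowest RMSD, which may not be how the server actually stores or ranks structures.

```python
def top_n(results, n=10):
    """True top n: lowest RMSD first, several entries per user allowed."""
    return sorted(results, key=lambda r: r[1])[:n]


def top_n_one_per_user(results, n=10):
    """Old listing: keep each user's best structure only, then take the top n."""
    best = {}
    for user, rmsd in results:
        if user not in best or rmsd < best[user]:
            best[user] = rmsd
    return sorted(best.items(), key=lambda r: r[1])[:n]
```

    With one user holding several of the best structures, top_n returns that user repeatedly while top_n_one_per_user returns them once, which is why a single account can now dominate the list.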

    update
    Ok, fixed it (I think). Sorry 'bout that. Hopefully it won't disrupt your in-progress simulations.


    update 2
    Ok, now I think it is really fixed (gimme a break, it's midnight after all...). If you find it is no longer continuing your movie (i.e. if AMD's #1 movie, which is currently at gen 110, doesn't proceed to gen 111 now), do NOT delete any files yet. Check if the files are still on your machine (and it keeps trying to upload them each time) or if they are gone forever (in which case, well, it's gone). If the files are still on your machine I can probably still fix it. It's too late at night for me to figure out what the server will do with the stuff it got the last few hours, so we'll see tomorrow. Nighty-night all.
    Last edited by Brian the Fist; 03-20-2003 at 12:04 AM.
    Howard Feldman

  35. #75
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184
    Ok, I turned my upload script back on and the upload seems to have worked.
    I got credit and finished the 4.99A structure. (Alas, its RMS went the wrong way. I guess 50 structures per generation just isn't enough.)

    I'm glad it wasn't the new beta client. It crunches faster, and both my accounts have gotten their current best structure since I installed it.

  36. #76
    Ok, looks like you fixed it!

    Thanks!

    Now I've just got to try and get some better RMS values than the crap I've been getting!
    Team Anandtech DF!

  37. #77
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    Since it takes so long on slower machines to get to generation 100 (I moved to a faster machine a few days ago), can we have an option to have a larger starting pool to select the structure to work on from? Instead of the 5 or 10k we currently have, the option to increase the pool up to .. say 100k, so the slow machines have a much better chance of finding a great structure to work on?

  38. #78

    New dfGUI beta2 client available

    Hey everyone, I have had a few spare moments to add a couple of features to the dfGUI beta client, so a new version is available for download (same link as before).

    Download new beta at:
    http://gilchrist.ca/jeff/dfGUI/dfGUIv22beta.zip

    v2.2beta2 (Mar. 19, 2003)
    - # generations is now configurable in Config window.

    - Config window now appears in centre of dfGUI window so people using low resolutions will still be able to see and access the window.

    - Added display to keep track of best energy seen since you started the GUI.

    - Added display to indicate time it took to complete the previous generation

    - Added display to indicate average time it has taken to complete each generation

    - Removed structures per second and per minute; since the new beta works more slowly, these values no longer make sense.

    - Modified the code that restarts an inactive client to reduce the chance of it restarting multiple copies at the same time.


    If you find any problem, please let me know.
    Jeff.

    Screenshot:
    Last edited by Digital Parasite; 02-15-2009 at 01:44 PM.

  39. #79
    Registered User
    Join Date
    Oct 2002
    Location
    Ottawa, Ontario, Canada
    Posts
    69
    The size of the starting pool is not set by whim. It's a balance between quantity and quality that Howard's trying to maintain. Doing this type of thing would be counter-productive, IMHO.

    ms

  40. #80
    Just a small request:

    For the details on the top 10 folds could you also provide text info?

    Specifically:

    the values of the various potentials for each generation and the value of the RMSD for each generation.

    This could be provided as a static text file or something.

    Thanks!
    Team Anandtech DF!
