
Thread: Beta client bugs

  1. #41
    Wow, you guys dug up a can of worms real fast; looks like we'll have our hands full for a while yet.
    I have had trouble accessing the beta server from home, and I assume it is not just me, so I may need to give it a kick Monday morning. You can continue to run the beta, or wait for us to look into and fix some of the reported errors and release a second beta (that may not be for a couple of weeks, as our lead programmer is off for a bit now).

    The only thing I would ask is please read this thread before posting a 'bug'. If someone else has already posted it, or something similar, please refrain from posting it again. This helps us keep track of the problems.

    thanks for your excellent help.
    Howard Feldman

  2. #42
    In a nutshell, from reading this thread, some others, and my own experience with 3 Linux boxen, one of the big issues seems to be that a fresh client can't start when the server is offline, but an already running client with a receipt.txt file seems to keep going fine. Perhaps a dummy receipt.txt file generated on gen 0 might be an answer?

  3. #43
    <-- Hasn't been able to access anteaterbeta all day...

  4. #44
    Similar error to what TazAmdmb posted above, but the strange 'garbage' characters are different.

    Sat Feb 14 01:08:09 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
    Sat Feb 14 01:08:39 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
    Sat Feb 14 01:09:09 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
    Sat Feb 14 01:09:39 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
    Sat Feb 14 01:09:39 2004 ERROR: [777.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
    Sat Feb 14 01:09:39 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
    Sat Feb 14 01:09:39 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °«°«nse or unable to connect to server

    Sat Feb 14 01:10:25 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file $¼ found in upload list


    Checking filelist.txt, there was this:
    .\fold_0_*handle*_5_*handle*_protein_92.log.bz2
    .\*handle*_0_*handle*_protein_92_0000006.val
    .\fold_0_*handle*_5_*handle*_protein_93.log.bz2
    .\*handle*_0_*handle*_protein_93_0000010.val
    .\fold_0_*handle*_0_*handle*_protein_94.log.bz2
    .\*handle*_0_*handle*_protein_94_0000001.val
    .\fold_0_*handle*_5_*handle*_protein_95.log.bz2
    $¼
    .\fold_0_*handle*_0_*handle*_protein_96.log.bz2
    .\*handle*_0_*handle*_protein_96_0000003.val
    .\fold_0_*handle*_0_*handle*_protein_97.log.bz2
    .\*handle*_0_*handle*_protein_97_0000001.val
    CurrentStruc 0 6 134 97 1 1 30.416 -2577.607 417.272 -569.484 11154609.000 2.550 4.900 28951.203 ----HHHHHHHHHHH---------HHHHHHHHHHHH-------------------------------------------HHHH---------HHHHHHHH------HHHH-------------------
    16556d34825d2f8a0479c9af08b40bef

    Edit: The $¼ also has some additional unprintable characters in it.
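A stray entry like the one in this listing can be located without reading the file by eye. Below is a minimal Python sketch, assuming entries follow the two filename shapes visible in the listing (plus the `_min.val` form seen in a later error message), and that the file ends with a `CurrentStruc` record and a 32-character hex line. The function name `find_suspect_entries` and the patterns are assumptions for illustration, not part of the DF client.

```python
import re
import sys

# Patterns inferred from the filelist.txt listings in this thread.
# The real client may accept other entries; treat these as assumptions.
ENTRY_PATTERNS = [
    re.compile(r"^\.[\\/]fold_\d+_.+_\d+_.+_protein_\d+\.log\.bz2$"),
    re.compile(r"^\.[\\/].+_\d+_.+_protein_\d+_\d{7}(_min)?\.val$"),
    re.compile(r"^CurrentStruc\s"),   # trailing state record
    re.compile(r"^[0-9a-f]{32}$"),    # trailing 32-character hex line
]


def find_suspect_entries(lines):
    """Return (line_number, text) for each non-blank line matching no pattern."""
    suspects = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        if not text.strip():
            continue
        if not any(pattern.match(text) for pattern in ENTRY_PATTERNS):
            suspects.append((number, text))
    return suspects


if __name__ == "__main__" and len(sys.argv) > 1:
    # latin-1 maps every byte to a character, so unprintable bytes
    # in a damaged file cannot abort the scan.
    with open(sys.argv[1], encoding="latin-1") as handle:
        for number, text in find_suspect_entries(handle):
            print(f"line {number}: {text!r}")
```

Run against a copy of filelist.txt, this only reports the lines that look out of place; it does not repair anything.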

  5. #45
    Senior Member
    Join Date
    Apr 2002
    Location
    Near Frankfurt, Germany
    Posts
    106

    Sat Feb 21 14:11:21 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to www.distributedfolding.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:11:21 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
    Sat Feb 21 14:11:42 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to www.distributedfolding.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:11:42 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
    Sat Feb 21 14:12:03 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to www.distributedfolding.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:12:03 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
    Sat Feb 21 14:12:03 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
    Sat Feb 21 14:12:25 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to ftp.mshri.on.ca:80 (Unknown) {errno=Invalid argument}
    Sat Feb 21 14:12:25 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to ftp.mshri.on.ca:80 failed: Unknown
    Sat Feb 21 14:12:46 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to ftp.mshri.on.ca:80 (Unknown) {errno=Invalid argument}
    Sat Feb 21 14:12:46 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to ftp.mshri.on.ca:80 failed: Unknown
    Sat Feb 21 14:13:07 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to ftp.mshri.on.ca:80 (Unknown) {errno=Invalid argument}
    Sat Feb 21 14:13:07 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to ftp.mshri.on.ca:80 failed: Unknown
    Sat Feb 21 14:13:07 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
    Sat Feb 21 14:13:07 2004 ERROR: [000.000] {foldtrajlite2.c, line 2197} Unable to check server status
    Sat Feb 21 14:13:29 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:13:29 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
    Sat Feb 21 14:13:50 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:13:50 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
    Sat Feb 21 14:14:11 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:14:11 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
    Sat Feb 21 14:14:11 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
    Sat Feb 21 14:14:32 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:14:32 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
    Sat Feb 21 14:14:53 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:14:53 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
    Sat Feb 21 14:15:14 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
    Sat Feb 21 14:15:14 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
    Sat Feb 21 14:15:14 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
    Sat Feb 21 14:15:14 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server
    Client crashed with a fatal exception.
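In a log like this, the useful part is which hosts were tried and how often each failed. A minimal Python sketch, assuming the `[URL_Connect] Socket connect to HOST:PORT failed: REASON` wording shown in the log; the function name `summarize_connect_failures` is hypothetical.

```python
import re
from collections import Counter

# Matches the [URL_Connect] lines as they appear in the log excerpt.
CONNECT_FAILURE = re.compile(
    r"\[URL_Connect\] Socket connect to (\S+):(\d+) failed: (\w+)"
)


def summarize_connect_failures(lines):
    """Count failed connects per (host, port, reason)."""
    counts = Counter()
    for line in lines:
        match = CONNECT_FAILURE.search(line)
        if match:
            host, port, reason = match.groups()
            counts[(host, int(port), reason)] += 1
    return counts
```

Posting the resulting counts instead of the full log keeps a report short while still showing that every server was unreachable.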

  6. #46
    Social Parasite
    Join Date
    Jul 2002
    Location
    Hill Country
    Posts
    94
    Originally posted by erk
    In a nutshell, from reading this thread, some others, and my own experience with 3 Linux boxen, one of the big issues seems to be that a fresh client can't start when the server is offline, but an already running client with a receipt.txt file seems to keep going fine. Perhaps a dummy receipt.txt file generated on gen 0 might be an answer?
    I run dial-up. If I want to start the client when not connected (or if the server is offline!), I just use '-i f' (without quotes) as a parameter. Works for me.

    mikus

  7. #47
    Beta client has locked up twice on me now. No error messages being generated.

    Happened while calculating trajectory distribution this time, not sure about the first time.

    Edit:
    Running win2k sp2
    Dual P4 1.7GHz Xeons
    2GB RAM
    .\foldtrajlite -f protein -n native -qf -it -rt
    No 3rd party apps related to DF running

    DF crashed over the weekend, the only other thing running on that box was GIMPS dedicated to the second processor.
    Last edited by Galuvian; 02-23-2004 at 11:11 AM.

  8. #48
    At the risk of repeating myself,

    When reporting bugs with the beta client, please provide as much information as possible: your exact OS, amount of RAM, how to reproduce the problem (if known), any messages in the error log (it is not necessary to post the WHOLE log if it's the same error over and over...), the flags you were using in the foldit.bat (if other than the defaults), and whether you were using dfGUI or something similar, or running from the command line.

    Without all this information, we have little or no chance of fixing any bugs you may report. Thanks!
    Howard Feldman
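For logs that repeat the same error over and over, the repeats can be collapsed before posting. A minimal Python sketch, assuming each entry starts with a timestamp in the `Sat Feb 14 01:08:39 2004` form seen in this thread; `collapse_repeats` is a hypothetical helper, not something shipped with the client.

```python
import re

# Leading timestamp as it appears in error.log, e.g. "Sat Feb 14 01:08:39 2004 ".
TIMESTAMP = re.compile(r"^\w{3} \w{3} [ \d]?\d \d{2}:\d{2}:\d{2} \d{4} ")


def collapse_repeats(lines):
    """Return [(count, message)] in first-seen order, ignoring timestamps."""
    counts = {}
    for line in lines:
        text = line.strip()
        if not text:
            continue
        message = TIMESTAMP.sub("", text, count=1)
        counts[message] = counts.get(message, 0) + 1
    return [(count, message) for message, count in counts.items()]
```

Lines without a timestamp, such as the `====[ date ]====` separators, are kept unchanged and counted like any other message.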

  9. #49
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    Not really a bug, but I have this in the error.log:
    Mon Feb 23 19:09:20 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °7 (<- in the log there is a rectangle between the ° and 7)

    preceded and followed by:
    Mon Feb 23 19:09:20 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Timeout) {errno=No such file or directory}
    Mon Feb 23 19:09:20 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Timeout
    Mon Feb 23 19:09:20 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up

    and earlier i have a message:
    Mon Feb 23 18:54:58 2004 ERROR: [000.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192

    I have ISDN, so possibly my ISDN connection hung up.



    i also had things like this:
    Mon Feb 23 12:11:16 2004 ERROR: [000.000] {foldtrajlite2.c, line 4680} File handle_1_handle_protein_8_0000007_min.val is corrupt, missing or has been tampered with; cannot continue - replace file and start again, or manually delete filelist.txt
    Mon Feb 23 12:11:16 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: Data file checksum failed

    However, I didn't have to start from zero; it continued after restarting the client.



    and finally once or twice I got:
    Mon Feb 23 11:31:46 2004 FATAL ERROR: [003.001] {foldtrajlite2.c, line 5816} Unable to fetch Biostruc

    had to restart the client. then it folded on....





    using win xp with sp1 and most updates
    using dfgui 3.3 beta
    athlon xp 2000+
    having 512mb ram ddr 333
    enough free diskspace (~400mb)
    df running about 3 hours per day
    switches: .\foldtrajlite -f protein -n native -qt -rt -i f -g 1

    I hope this report contains the information you asked for.

  10. #50
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    Originally posted by Chaser
    and finally once or twice I got:
    Mon Feb 23 11:31:46 2004 FATAL ERROR: [003.001] {foldtrajlite2.c, line 5816} Unable to fetch Biostruc
    This happens (on the non-beta client at least...) when you delete the files that DF is using inside the temp directory.

    Those files are named file*.cdx, file*.dbf, file*.fpt, and file* (with no extension). Basically I've gotten to the point where I don't delete any of those files unless DF is *not* running.

    If your system has a temp directory cleaner, that could be the culprit, too.
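A cleanup script can be made to skip these files. Below is a minimal Python sketch of the name check, assuming the four patterns listed above are the complete set; the helper names are hypothetical, and a file called `filelist` with no extension would also match, so treat the result as a hint only.

```python
from fnmatch import fnmatch
from pathlib import Path

# Extensions named in the post; "" stands for "file* with no extension".
PROTECTED_EXTENSIONS = {".cdx", ".dbf", ".fpt", ""}


def is_df_temp_file(name):
    """True if the name looks like one of DF's temp files (file*.cdx etc.)."""
    lowered = name.lower()
    return fnmatch(lowered, "file*") and Path(lowered).suffix in PROTECTED_EXTENSIONS


def df_temp_files(temp_dir):
    """List names in temp_dir that a cleaner should leave alone while DF runs."""
    return sorted(
        entry.name
        for entry in Path(temp_dir).iterdir()
        if entry.is_file() and is_df_temp_file(entry.name)
    )
```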
    "If you fail to adjust your notion of fairness to the reality of the Universe, you will probably not be happy."

    -- Originally posted by Paratima

  11. #51
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59

    Yep! I think you're right. I think I cleaned my temp dir and didn't watch out for those files. At least in this version you just have to restart the client; previously the client had to start again from zero.

    @howard... why not create a temp folder in the df folder?!

  12. #52
    Senior Member
    Join Date
    Apr 2002
    Location
    Santa Barbara CA
    Posts
    355
    I have a dual Athlon box running Mandrake 9.0. I am running one beta client normally and the other with the -if flag. On the normal client when you run with the -ut flag to upload it stays in the normal terminal mode and prints out a line for each upload that happens. With the beta client it goes to black background that you normally see when you are actually folding and then starts printing out the upload lines. These get put one under the other until it gets to the bottom of the screen. Then the next one gets added to the right of the last and then you don't see anymore. So you can only see how it is progressing for about the first 18 with my terminal size.

  13. #53
    Originally posted by Welnic
    I have a dual Athlon box running Mandrake 9.0. I am running one beta client normally and the other with the -if flag. On the normal client when you run with the -ut flag to upload it stays in the normal terminal mode and prints out a line for each upload that happens. With the beta client it goes to black background that you normally see when you are actually folding and then starts printing out the upload lines. These get put one under the other until it gets to the bottom of the screen. Then the next one gets added to the right of the last and then you don't see anymore. So you can only see how it is progressing for about the first 18 with my terminal size.
    Yep, that's a bug - easy to fix though, thanks
    Howard Feldman

  14. #54
    The most prevalent new bug then seems to be some filenames with strange characters in them (like squares and symbols). We will try to find the source of this and fix it ASAP
    Howard Feldman

  15. #55
    Junior Member
    Join Date
    Jun 2002
    Location
    Australia
    Posts
    11
    Been running beta client on and off for a while and got this today, the client had no other distinct errors (other than not connecting to server), it just stopped .. It was up to gen166 at the time

    Fri Feb 27 10:00:19 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file T7 found in upload list

    ========================[ Feb 27, 2004 11:25 AM ]========================
    Starting foldtrajlite built Feb 11 2004
    Fri Feb 27 11:25:32 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file T7 found in upload list

    Running XP Pro sp1 with no updates (oops)

    Only flag is -rt

    Xp2100+ @ 2254
    256 Geil @ 204fsb
    Jetway n2pa-ultra NForce2
    GF2mx

    The box performs faultlessly 24/7 on DF


    System is headless

    Cheers

    Edit / found the o/s had no updates
    Last edited by hallmar; 02-27-2004 at 03:34 AM.
    OCAU Distributed Folding Team Member
    11.5gig DF'ing 24/7
    Idle temp ?? What idle temp ??

  16. #56
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    Yesterday I lost 342 generations: my PC crashed and the filelist was then reported as tampered with... fuc*!

  17. #57
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    Well, I restarted the client, then tried to upload (about 30 gens). My internet connection went down, so the upload couldn't finish. Now I get the following message:

    Fri Feb 27 18:45:03 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Timeout
    Fri Feb 27 18:45:03 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up

    ========================[ Feb 27, 2004 6:45 PM ]========================
    Starting foldtrajlite built Feb 11 2004
    Fri Feb 27 18:45:12 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file mmdb2.h60 found in upload list
    The last message is the important one! Restarting the client doesn't solve the problem. Using dfGUI 3.3 beta, offline, quiet mode, extra RAM, Win XP Pro SP1, 512 MB RAM.

    :/






    forgot to post the filelist.txt
    .\fold_0_######_4_######_protein_7.log.bz2
    .\######_0_######_protein_7_0000005.val
    .\fold_0_######_1_######_protein_8.log.bz2
    .\######_0_######_protein_8_0000002.val
    .\fold_0_######_7_######_protein_9.log.bz2
    .\######_0_######_protein_9_0000008.val
    .\fold_0_######_4_######_protein_10.log.bz2
    .\######_0_######_protein_10_0000005.val
    .\fold_0_######_7_######_protein_11.log.bz2
    .\######_0_######_protein_11_0000008.val
    .\fold_0_######_4_######_protein_12.log.bz2
    .\######_0_######_protein_12_0000005.val
    .\fold_0_######_8_######_protein_13.log.bz2
    .\######_0_######_protein_13_0000009.val
    .\fold_0_######_9_######_protein_14.log.bz2
    .\######_0_######_protein_14_0000010.val
    .\fold_0_######_6_######_protein_15.log.bz2
    .\######_0_######_protein_15_0000007.val
    .\fold_0_######_2_######_protein_16.log.bz2
    .\######_0_######_protein_16_0000003.val
    .\fold_0_######_5_######_protein_17.log.bz2
    .\######_0_######_protein_17_0000006.val
    .\fold_0_######_1_######_protein_18.log.bz2
    .\######_0_######_protein_18_0000002.val
    .\fold_0_######_1_######_protein_19.log.bz2
    .\######_0_######_protein_19_0000002.val
    .\fold_0_######_5_######_protein_20.log.bz2
    .\######_0_######_protein_20_0000006.val
    .\fold_0_######_1_######_protein_21.log.bz2
    .\######_0_######_protein_21_0000002.val
    mmdb2.h60
    .\######_0_######_protein_22_0000003.val
    .\fold_0_######_1_######_protein_23.log.bz2
    .\######_0_######_protein_23_0000002.val
    .\fold_0_######_8_######_protein_24.log.bz2
    .\######_0_######_protein_24_0000009.val
    .\fold_0_######_0_######_protein_25.log.bz2
    .\######_0_######_protein_25_0000001.val
    .\fold_0_######_8_######_protein_26.log.bz2
    .\######_0_######_protein_26_0000009.val
    .\fold_0_######_0_######_protein_27.log.bz2
    .\######_0_######_protein_27_0000001.val
    CurrentStruc 0 3 134 27 1 1 99.990 -244.406 961.789 71.738 984772.375 0.850 1.500 250.000 -----HHHHHHHHHHH-----------HHHHHHH---------------------------------------------
    -------HHHHHHHHHHHHH--------------------HHHHHHH---
    6f2a51f3f8c35a7e73b4bf1726a0f584
    Last edited by Chaser; 02-27-2004 at 01:14 PM.

  18. #58
    Originally posted by Pascal
    ========================[ Feb 14, 2004 8:50 AM ]========================
    Starting foldtrajlite built Feb 11 2004
    Sat Feb 14 08:53:04 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
    Sat Feb 14 08:53:06 2004 ERROR: [000.000] {foldtrajlite2.c, line 4355} Failed to query status for ticket anteaterbeta.blueprint.org_1076695936_31597

    using -if -qt -rt -g10 -- for offline folding
    and -u -- for uploading

    Neither uploading nor continuing to fold is possible. The client waits for the upload status from the server.

    Hardware: Athlon 1400 TB-C, 1.4 GC/s, 512 MB DDR-RAM, Win 2000. No hardware failures for months!
    This should not be the case. The upload part will indeed get aborted due to inability to query the status, but the client should continue folding locally until it is ready to upload again, at which point the receipt will be checked yet again. If this is not the case for you and the client exits, I would like to take a look at your directory when this occurs, or you could provide detailed steps as to how the bug can be reproduced.
    Elena Garderman

  19. #59
    Originally posted by deranged128[OCAU]
    I've finally got some work being uploaded, only 197 gens at this stage, but the time taken to 'verifying upload on server' is far too long. The PC I'm running this from is an XP2100+ @ 1980 with 512 MB DDR333 ram, WinXP.

    I have a 512/128 ADSL connection and apart from another 5 PCs which are running DF exclusively there is no other network traffic. The time though to verify these uploads is really putting a dent in the production time. Total upload time for each gen is around 50 seconds so for 197 gens I'm looking at 2.75 hours to upload with nothing else happening.

    If this new back end verifies each upload and it is taking this long with just a few beta testers, what is it going to be like when the whole system is doing the same thing?

    The alternative, as I see it, is for the upload process to run concurrently with folding. ie when uploading the client continues to fold, alleviating some very substantial down time.

    Cheers,

    Barry
    The individual verification is necessary to avoid any disruptions in the consecutive upload of generations. This is in place to prevent previously occurring errors that were related to client-server timeouts without the proper communication. Keep in mind that we are only testing with one server machine, which slows down the individual upload time when many users are all uploading at the same time.

    If enough people are interested, it may be possible to add a flag to suppress the verification messages, which may marginally speed up the process - for now you can try uploading in quiet mode.
    Elena Garderman

  20. #60
    Originally posted by jkeating
    Beta crashed on 1800+ AXP running WinXP. When I try to restart, it says: "Uploading fileset 1/4 to server...", but it just hangs there without uploading.

    Here is the error log:

    ========================[ Feb 13, 2004 11:45 AM ]========================
    Starting foldtrajlite built Feb 11 2004
    Fri Feb 13 17:35:15 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
    Fri Feb 13 17:58:20 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: NO_STATUS_FOUND
    Fri Feb 13 20:40:04 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
    Fri Feb 13 23:04:53 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
    Fri Feb 13 23:04:53 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server
    .....

    You should not be getting the NO_STATUS_FOUND error, as this simply forces a re-upload of the most recently uploaded fileset. Thanks for pointing that out, it will be fixed shortly.

    As for the timeouts, as long as the server cannot be reached, the upload won't proceed and it should fold locally. When you say "it hangs", what exactly do you mean by that? Folding locally? Quitting? Stops on a verification message and doesn't continue? And what flags were you running with when this occurred?
    Elena Garderman

  21. #61
    Ol' retired IT geezer
    Join Date
    Feb 2003
    Location
    Scarborough
    Posts
    92

    Dialup observations

    I started with a fresh copy of everything and ran with internet
    disabled. After getting 37 generations, I started my line monitor
    (MyVitalAgent), then dialed up ISP, and then signaled dfGUI to
    upload. The line monitor displays a count of bytes sent and received
    by "transaction". I'm not sure how much control information of the
    various protocols are included in these counts. With the upload of a
    "generation", it reported 116 bytes sent, then 486 received, then
    after delay, 1883 received. Upload of 37 generations required 29
    minutes. (Monitor keeps track of time connected which is another
    reason I use it.)

    The good thing... Beta upload gets a lot less information back
    than current upload, so line speed is not an important factor
    any more.
    The bad thing... Upload/ verification is a lot slower than current
    upload. Current upload is three generations a minute, beta is
    slightly less than one per minute.

    Ned...
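The timing figures reported in this thread can be turned into a rough estimate of upload time. A minimal Python sketch using only numbers quoted above (37 generations in 29 minutes, and about 50 seconds per generation for 197 generations); the function names are made up for illustration.

```python
def seconds_per_generation(generations, minutes):
    """Average upload time per generation, in seconds."""
    return minutes * 60 / generations


def upload_hours(generations, seconds_each):
    """Total upload time in hours for a buffer of generations."""
    return generations * seconds_each / 3600


# Dial-up report: 37 generations in 29 minutes, about 47 seconds each.
dialup = seconds_per_generation(37, 29)

# ADSL report: 197 generations at about 50 seconds each, about 2.74 hours.
adsl_total = upload_hours(197, 50)

print(f"{dialup:.1f} s per generation, {adsl_total:.2f} h for 197 generations")
```

Both reports land close to the same per-generation time despite very different line speeds, which fits the observation that line speed is no longer the limiting factor.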

  22. #62
    25/25Mbit is nearly enough :p
    Join Date
    Dec 2001
    Location
    Denmark
    Posts
    831
    My beta client crashed this weekend with the following error:

    Sat Mar 06 07:39:38 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file RAND found in upload list
    This is on a standard P4 2ghz running WinXP, hasn't been OC'ed.

    I have the complete folder, so if you want it I can upload it.
    Pointwood
    Jabber ID: pointwood@jabber.shd.dk
    irc.arstechnica.com, #distributed

  23. #63
    Ol' retired IT geezer
    Join Date
    Feb 2003
    Location
    Scarborough
    Posts
    92

    Group generation results at Client

    Group the Results of Generations at the Client
    ++++++++++++++++
    I'm assuming that you are using a full function database to store
    the folding results.
    With any database with full recovery capability, a large majority
    of the processing is in the system overhead for the commitment of
    the updates to the DB.
    When you perform multiple updates before you request DB
    commitment, you save on that overhead for an individual update.
    When I was involved with a benchmark/prototype for message
    processing once, we found that updating of the database in
    groups of 40 messages at a time was the perfect balance of
    response time versus overhead.
    For DF, the optimum number of updates per commitment would
    depend on your server hardware and software.
    In your case, where the accumulated generation results would
    probably be all inserted in the same place in the DB, you
    wouldn't have the problem of scattered inserts and associated
    lockout problems.
    As an example, I estimate that if 100 units of resource were
    used to insert one record in the database, then 150 units of
    resource would be used to insert ten logical sequential
    records in the same database as one DB commit.
    That is roughly a 6.7-fold reduction in resources used (1000 units versus 150)!
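The estimate above can be checked with a line of arithmetic. A minimal Python sketch using the poster's own illustrative figures (100 units per single insert, 150 units for ten inserts under one commit); these are estimates from the post, not measurements.

```python
def batching_gain(single_cost, batch_cost, batch_size):
    """Resources for batch_size separate commits divided by one batched commit."""
    return (single_cost * batch_size) / batch_cost


gain = batching_gain(single_cost=100, batch_cost=150, batch_size=10)
print(f"ten separate commits cost {gain:.2f} times as much as one batched commit")
```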

    --------------------------------
    Now, at the client...
    First, in fast machines, the generation processing time is small.
    Even slow machines get the job done.
    Second, the data uploaded per generation is small.
    SOOOOO...
    -------
    Consider gathering 5 to 50 generations before attempting to upload
    the results, even if a permanent connection exists.
    Since the data uploaded per generation is small, bandwidth is not
    an issue; AND the processing resources required at the server
    is a major consideration.
    You might want to make the number of generations user selectable
    with a minimum and maximum.
    In the case where internet is not available, gather the maximum
    generations per upload.
    You would need to retain the "upload" request capability to force
    an upload at any time with the accumulated generations.
    --------
    During uploading of the beta, I noticed that the verification
    request actually took longer than the insertion request.
    That is probably because the verification request has to wait
    for the commit process for the insertion request to complete,
    before the verification request can start.
    --------
    In the beta, you are verifying the data is uploaded before
    continuing.
    In the case of committing the update of many results, verifying
    that the last one was inserted correctly would give that
    validation.
    -----------
    In Summary, group the results for multiple generations at the
    client and then process these sequential results as one DB
    commit at the server.

    The advantage to the project is the gathering of data with
    far less resources...
    The alternative is to throw multiple copies of hardware at it...

    Ned
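The difference between committing every insert and committing once per group can be sketched as follows. This is a minimal Python illustration using the standard `sqlite3` module as a stand-in, since the thread does not say which database the project uses; the `results` table and its columns are hypothetical.

```python
import sqlite3

INSERT = "INSERT INTO results (generation, energy) VALUES (?, ?)"


def make_db():
    """Create an in-memory database with a hypothetical results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (generation INTEGER PRIMARY KEY, energy REAL)")
    return conn


def insert_per_row(conn, rows):
    """Commit after every insert; returns the number of commits issued."""
    commits = 0
    for row in rows:
        conn.execute(INSERT, row)
        conn.commit()
        commits += 1
    return commits


def insert_batched(conn, rows, batch_size=10):
    """Commit once per batch of rows; returns the number of commits issued."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        conn.executemany(INSERT, rows[start:start + batch_size])
        conn.commit()
        commits += 1
    return commits
```

Both functions store the same rows; only the number of commits differs. How much that saves on a given server depends on its hardware and database, so any real batch size would have to be measured there.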

  24. #64
    Although your idea is good in theory, it would not apply in practice. The reason for that is simple - the upload of each generation depends entirely on the successful upload and processing of the preceding generation. So if we allowed the upload of 50 generations all at once, then proceeded to validate this and discovered that the second generation is somehow corrupted, this would only lead to a waste of resources, and to 48 useless generations we would then have to discard. Not to mention the fact that the error messages provided to the user in that case would be completely out of sync.

    Most people are happy with the option to fold offline, and it is pretty simple to write a script which uploads after X number of generations is buffered.
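Such a script mostly needs to know how many generations are buffered. Below is a minimal Python sketch, assuming each buffered generation contributes one `.val` line to filelist.txt, which is inferred from the listings in this thread and not confirmed by the developers. The function names are hypothetical, and the upload command itself is left to the caller because the thread mentions upload flags but not a complete command line.

```python
import sys


def buffered_generations(filelist_lines):
    """Count .val entries, assumed to be one per buffered generation."""
    return sum(
        1 for line in filelist_lines
        if line.strip().lower().endswith(".val")
    )


def should_upload(filelist_lines, threshold):
    """True once at least `threshold` generations are waiting."""
    return buffered_generations(filelist_lines) >= threshold


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_buffer.py filelist.txt 30
    # Exit status 0 means "upload now", 1 means "keep folding".
    with open(sys.argv[1], encoding="latin-1") as handle:
        due = should_upload(handle, int(sys.argv[2]))
    sys.exit(0 if due else 1)
```

A scheduled task or shell loop could call this periodically and start the client's upload only when the exit status is 0.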

    Thanks for the suggestion though, it is good to know our users are on the lookout for ways to improve the system.
    Elena Garderman

  25. #65
    Originally posted by pointwood
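    My beta client crashed this weekend with the following error:

    Sat Mar 06 07:39:38 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file RAND found in upload list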
    This is on a standard P4 2ghz running WinXP, hasn't been OC'ed.

    I have the complete folder, so if you want it I can upload it.
    Yes, I would like to take a look at it if you still have it - please upload to ftp.blueprint.org - use the /incoming directory at the root level, logging in as anonymous.
    Elena Garderman

  26. #66
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    Originally posted by Stardragon
    Yes, I would like to take a look at it if you still have it - please upload to ftp.blueprint.org - use the /incoming directory at the root level, logging in as anonymous.

    And send them an email with the file name you uploaded, and all the system specs and details of the problem again..
    www.thegenomecollective.com
    Borging.. it's not just an addiction. It's...

  27. #67
    25/25Mbit is nearly enough :p
    Join Date
    Dec 2001
    Location
    Denmark
    Posts
    831
    What email address should that be sent to?

    Anyway, it's uploaded now - the filename is "pointwood.zip".

    The machine is a standard Fujitsu-Siemens P4 2GHz with 512MB mem.
    Pointwood
    Jabber ID: pointwood@jabber.shd.dk
    irc.arstechnica.com, #distributed

  28. #68
    Just drop an e-mail to trades@mshri.on.ca outlining the flags you were running with and what you were trying to do when the bug occurred (e.g. buffering offline, then trying to upload the buffered results).
    Elena Garderman

  29. #69
    25/25Mbit is nearly enough :p
    Join Date
    Dec 2001
    Location
    Denmark
    Posts
    831
    I tried mailing you, but it failed for some reason

    Anyway, I run it with the mem switch.
    Pointwood
    Jabber ID: pointwood@jabber.shd.dk
    irc.arstechnica.com, #distributed

  30. #70
    Please use the newly posted beta version and report any bugs in the new thread - http://www.free-dc.org/forum/showthr...&threadid=5804
    Elena Garderman

  31. #71
    Ned,

    rest assured we are using a 'professional' database backend, and it has been set up by people who know what they are doing, and customized for this specific job.
    Howard Feldman

  32. #72
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    P.S. How about saving a filelist_?_????????_protein_???.txt with every gen? If the last filelist.txt corrupts, you can just "-purgeuploadlist 1", rename the last filelist*.txt file, and roll on, since every filelist_?_????????_protein_???.txt would be valid for its and all previous gens.

    The client can repeat the above automatically until it hits a good file, or until it runs out of buffered gens. If the latter, then it just starts over automatically. Everything is, of course, recorded automatically in the error.log file. This behaviour can be set as an option; e.g. -autorecover will try to recover from crash, otherwise display error message and stop.
    from this thread:
    http://www.free-dc.org/forum/showthr...5&pagenumber=1

    give a comment!
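A minimal sketch of the recovery loop proposed above, for discussion only. It assumes the client kept one snapshot of the upload list per generation, that the snapshots match the hypothetical glob `filelist_*.txt`, and that the caller supplies its own `is_valid` check (how the real client validates a list is not stated in this thread). None of these names come from the actual client.

```python
import glob
import os
import shutil


def recover_filelist(workdir, is_valid, pattern="filelist_*.txt"):
    """Restore filelist.txt from the newest snapshot that passes is_valid.

    Returns the path of the snapshot that was used, or None if no
    snapshot is usable (the caller would then start over).
    """
    # Hypothetical naming: snapshots are found by glob and ordered by
    # modification time, newest first.
    snapshots = sorted(
        glob.glob(os.path.join(workdir, pattern)),
        key=os.path.getmtime,
        reverse=True,
    )
    for snap in snapshots:
        if is_valid(snap):
            shutil.copyfile(snap, os.path.join(workdir, "filelist.txt"))
            return snap
    return None
```

The pattern deliberately requires an underscore after `filelist`, so the live `filelist.txt` and `filelist.txt.tmp` are never picked up as snapshots.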

  33. #73
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    ========================[ Mar 20, 2004 8:50 PM ]========================
    Starting foldtrajlite built Mar 9 2004
    Sat Mar 20 20:50:02 2004 FATAL ERROR: [002.000] {foldtrajlite2.c, line 1448} Cannot rename filelist.txt.tmp to filelist.txt - disk may be out of space
    My disk ran out of space, so I lost about 120 gens... Shouldn't this be handled better?!

    ========================[ Mar 20, 2004 5:43 PM ]========================
    Starting foldtrajlite built Mar 9 2004
    Sat Mar 20 17:45:27 2004 FATAL ERROR: CoreLib [002.005] {ncbifile.c, line 715} File write error
    No idea what this was...

    Running XP Pro SP1, AMD Athlon XP 2000+, 768 MB DDR, offline, extra RAM.

    cya

  34. #74
    Originally posted by Chaser
    My disk ran out of space, so I lost about 120 gens... Shouldn't this be handled better?!
    There is not much that can be done if there is no space to write out the filelist. You should generally leave a margin of free disk space for file-writing operations.

    The other error was also caused by the lack of disk space.
    Elena Garderman
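The error message above mentions renaming filelist.txt.tmp to filelist.txt, which suggests a write-to-temp-then-rename pattern. Below is a rough sketch of that pattern with a free-space check added in front, so the write is refused before the old list is touched. The function name, the 1 MiB safety margin, and the check itself are assumptions for illustration, not the client's actual code.

```python
import os
import shutil


def write_filelist(path, lines, min_free_bytes=1024 * 1024):
    """Write lines to path via a temp file, refusing if the disk is nearly full."""
    data = "".join(line + "\n" for line in lines).encode("utf-8")
    directory = os.path.dirname(os.path.abspath(path))

    # Assumed safety margin: fail early, while the old file is still intact.
    free = shutil.disk_usage(directory).free
    if free < len(data) + min_free_bytes:
        raise OSError(f"only {free} bytes free; refusing to rewrite {path}")

    tmp = path + ".tmp"
    with open(tmp, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())  # make sure the bytes reach the disk first
    os.replace(tmp, path)  # swaps the new file in, overwriting the old one
```

This does not create space where there is none; it only means a full disk leaves the previous filelist.txt in place instead of a half-written one.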

  35. #75
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    ========================[ Mar 26, 2004 1:45 PM ]========================
    Starting foldtrajlite built Mar 9 2004
    Fri Mar 26 13:45:52 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1569} Upload list has been tampered with, please delete filelist.txt and try again
    I had a look at filelist.txt with UltraEdit-32. There was a "?" at the very bottom of the file (on an extra line). I removed it and suddenly it could upload again. I saved both the original filelist.txt and the edited one that then worked.
    However, most of the remaining files have been uploaded by now.

    Edit:
    I looked at the file again in Notepad, and what's there at the end of the file? Three ****i**squares!
    Last edited by Chaser; 03-26-2004 at 01:20 PM.
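For anyone wanting to check their own file before editing it by hand, here is a small sketch that separates a file's contents from any unreadable junk after the last visible character. It assumes the list is plain ASCII text, which matches the entries posted in this thread but is not confirmed for every case; the function name is made up for this example.

```python
def split_trailing_garbage(data):
    """Split raw file bytes into (clean, garbage).

    garbage is everything after the last visible ASCII character, but
    only if that tail contains something other than ordinary whitespace.
    """
    end = len(data)
    # Walk back over bytes that are not visible ASCII (0x21..0x7E).
    while end > 0 and not (0x21 <= data[end - 1] <= 0x7E):
        end -= 1
    tail = data[end:]
    if all(b in b"\r\n\t " for b in tail):
        return data, b""  # just a normal trailing newline, nothing to cut
    return data[:end], tail


if __name__ == "__main__":
    with open("filelist.txt", "rb") as fh:
        clean, junk = split_trailing_garbage(fh.read())
    print(f"{len(junk)} suspicious trailing bytes: {junk!r}")
```

Keep a copy of the original file before writing the cleaned version back, as was done above.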

  36. #76
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59

    *grr*

    Previous generation missing

    ========================[ Mar 28, 2004 9:43 PM ]========================
    Starting foldtrajlite built Mar 9 2004
    Sun Mar 28 21:43:42 2004 ERROR: [000.000] {foldtrajlite2.c, line 4620} Cannot find structure from previous generation .\########_2_########_protein_102_0000002_min.val; find it manually or delete filelist.txt to continue
    Sun Mar 28 21:43:42 2004 ERROR: [000.000] {foldtrajlite2.c, line 4963} Error during upload: Previous generation missing

    .\fold_2_########_1_########_protein_102.log.bz2
    .\########_2_########_protein_102_0000002.val
    .\fold_2_########_1_########_protein_103.log.bz2
    .\########_2_########_protein_103_0000002.val
    .\fold_2_########_8_########_protein_104.log.bz2
    .\########_2_########_protein_104_0000009.val
    .\fold_2_########_9_########_protein_105.log.bz2
    .\########_2_########_protein_105_0000010.val
    .\fold_2_########_3_########_protein_106.log.bz2
    .\########_2_########_protein_106_0000004.val
    .\fold_2_########_3_########_protein_107.log.bz2
    .\########_2_########_protein_107_0000004.val
    .\fold_2_########_0_########_protein_108.log.bz2
    .\########_2_########_protein_108_0000001.val
    .\fold_2_########_1_########_protein_109.log.bz2
    .\########_2_########_protein_109_0000002.val
    .\fold_2_########_8_########_protein_110.log.bz2
    .\########_2_########_protein_110_0000009.val
    .\fold_2_########_0_########_protein_111.log.bz2
    .\########_2_########_protein_111_0000001.val
    .\fold_2_########_5_########_protein_112.log.bz2
    .\########_2_########_protein_112_0000006.val
    .\fold_2_########_4_########_protein_113.log.bz2
    .\########_2_########_protein_113_0000005.val
    .\fold_2_########_0_########_protein_114.log.bz2
    .\########_2_########_protein_114_0000001.val
    .\fold_2_########_8_########_protein_115.log.bz2
    .\########_2_########_protein_115_0000009.val
    .\fold_2_########_1_########_protein_116.log.bz2
    .\########_2_########_protein_116_0000002.val
    .\fold_2_########_2_########_protein_117.log.bz2
    .\########_2_########_protein_117_0000003.val
    .\fold_2_########_0_########_protein_118.log.bz2
    .\########_2_########_protein_118_0000001.val
    .\fold_2_########_1_########_protein_119.log.bz2
    .\########_2_########_protein_119_0000002.val
    CurrentStruc 2 6 134 119 1 2 41.269 -2420.163 -288.118 -667.286 12191623.000 2.650 5.100 38287.996 ----HHHHH-HHHH------------HHHHHHHHHH-------------------------------------------HHHH---------HHHHHHH---------HHHH---------HHHH----
    ec640ed96d7f11cd6388c06ac2ad4803

    Normally running offline, XP Pro, AMD Athlon XP 2000+, 768 MB DDR RAM, enough free space.

    Just found 2 foldtraj...exe clients running. Stopped one via Task Manager, tried to upload, and bam: previous generation missing... shit!!!

    Don't know how 2 clients could start up at the same time. Running dfgui 3.3 beta...

    cya
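One common guard against two clients working in the same directory is a lock file created with an exclusive-create flag. The sketch below shows the idea only; the class name and lock path are hypothetical, and nothing in this thread says whether the client or dfgui does anything like this.

```python
import os


class SingleInstance:
    """Tiny lock-file guard: only one holder per lock_path at a time."""

    def __init__(self, lock_path):
        self.lock_path = lock_path
        self.fd = None

    def acquire(self):
        try:
            # O_EXCL makes creation fail if the file already exists.
            self.fd = os.open(
                self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY
            )
        except FileExistsError:
            return False  # someone else holds the lock
        os.write(self.fd, str(os.getpid()).encode("ascii"))
        return True

    def release(self):
        if self.fd is not None:
            os.close(self.fd)
            os.remove(self.lock_path)
            self.fd = None
```

A second client would call `acquire()` at startup and exit if it returns False. The known weakness is a stale lock left behind after a crash, which then has to be removed by hand or detected via the PID stored in the file.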

  37. #77
    Do you still have a copy of the directory after the error occurred? I would like to take a look at it if it's still available.
    Elena Garderman

  38. #78
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    Originally posted by Stardragon
    Do you still have a copy of the directory after the error occurred? I would like to take a look at it if it's still available.
    The directory is still available. The archive is named vakyiu6n.rar. Shall I upload it? I deleted only rotlin.bin.bz2 to make the file smaller.
    ~13 MB

  39. #79
    Could you please upload that directory to ftp.blueprint.org/incoming. You can log in as anonymous. Please send an e-mail to trades@mshri.on.ca when you have placed your directory there. Thank you.
    Elena Garderman

  40. #80
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    Originally posted by Stardragon
    Could you please upload that directory to ftp.blueprint.org/incoming. You can log in as anonymous. Please send an e-mail to trades@mshri.on.ca when you have placed your directory there. Thank you.
    I uploaded it and sent a mail...

    Another question:
    Will the points computed with the beta client be transferred to the normal stats?
