Page 2 of 2
Results 41 to 73 of 73

Thread: Client test version available

  1. #41
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982

    Re: Re: Energy Plateau

    Originally posted by Brian the Fist
    I would appreciate if you could post a pic of this graph. It is probably caused by 'truncation' of the energy function (which ensures it stays between 0 and 100).

    Also, just to clarify, when people mention in this thread 'RMSD' - please indicate where you are getting this number from - I suspect some (many) of you are confusing RMSD with pseudo-energy, and the two are quite different. Right now the only place you can see your RMSD, I believe, is if you log in at the web site and view your personal stats.
    The RMSD I mentioned was from the personal stats (I've gone from ~11.22 to 9.96 with this new client)...the pseudo-energy scores are looking similar (in fashion, not value) to when we had that fast protein - certain values are cropping up again and again (the most popular value seems to be 24.990 on my clients)...

    Have seen some interesting graphs as well where it flatlines for a bit then seems to process as normal (only graph I captured is below):


  2. #42
    Ol' retired IT geezer
    Join Date
    Feb 2003
    Location
    Scarborough
    Posts
    92

    Energy Plateau Image



    I would appreciate if you could post a pic of this graph. It is probably caused by 'truncation' of the energy function (which ensures it stays between 0 and 100).
    The above are the results after running from scratch for two days...

    Energy graph reported by dfGUI is the energy value reported by "progress.txt".

    Note the various plateaus (can't remember my Latin plurals...). Note it jumping back and forth between a couple of plateaus on a couple of occasions...

    Ned
    Last edited by Ned; 12-09-2003 at 07:43 AM.

  3. #43
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    @ned

    you have to upload the image on a webserver... so that we can see it...

  4. #44
    Senior Member
    Join Date
    Jul 2003
    Location
    Hamburg/Germany
    Posts
    386
    same here, value of the plateau is 30.546...

    Greets thor
    Attached Images

  5. #45
    Here's my latest graph:
    Attached Images
    Team Anandtech DF!

  6. #46
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    Originally posted by PCZ
    bwkaz

    Why don't you install a more up to date distro ?
    Because it'd take a good week (or more) to recompile everything. I use LFS, and I refuse to go back to a binary distro...

    I've actually got another partition with a basic system (incl. glibc 2.3.2) set up to the point where I can boot it, but getting everything else (X, Mozilla, Eterm, etc.) installed got put on hold while I upgraded my server.

    I am sure that the latest DF exe is not the only program that won't work for you.
    Actually, it is. Everything else I have source to (except certain games, like Rune, UT, UT2k3, Descent 3, and RtCW, but those still work on glibc 2.2), and source will generally compile against any glibc. Unless it's too low-level (something that requires NPTL, for instance), but I don't have anything like that.
    "If you fail to adjust your notion of fairness to the reality of the Universe, you will probably not be happy."

    -- Originally posted by Paratima

  7. #47
    Same system.....
    Opteron 146 / Corsair PC3200 ECC Reg / WinXP

    Using dfGui 3.2 clicked 'upload' and 31/33 generations uploaded then the following error displayed:

    ========================[ Dec 8, 2003 6:45 PM ]========================
    Starting foldtrajlite built Dec 4 2003
    Mon Dec 08 18:46:21 2003 ERROR: [-04.000] {rotlib.c, line 100} Bzip decompression of SCWRL library error occurred
    Mon Dec 08 18:46:21 2003 FATAL ERROR: [001.008] {foldtrajlite2.c, line 4481} Cannot open rotamer library (rotlib.bin.bz2) - if the file is missing, please re-install the software


    The client has been running fine (nonet/manual upload) since reinstalling it after the fatal error on Saturday (or was that friday?).

    Edit....I hit "recover" and the client began again by uploading the last 2 generations and has continued now without issue.
    Last edited by TazAmdmb; 12-08-2003 at 11:48 PM.

  8. #48

    Best Energy so far: 24.990

    I am getting a lot of Linux machines running the beta with the same Best energy reported in the progress.txt file which might be a bug. I had 3 machines at once reporting the same reading:

    Best Energy so far: 24.990

    None with values lower than 10 like previous clients; it might just be this protein or a fluke.

    Any ideas?
    Last edited by erk; 12-08-2003 at 09:58 PM.

  9. #49
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    On a Winxp home machine (256Megs ram) I ran the client for a little while no-net. It had 3 buffered generations, so there were 6 files listed in filelist.txt. 4 of those files started with ".\" and 2 did not. (I believe they were entries 3 and 4) Foldtrajlite complained when I put the missing ".\"s before the 2 file names that were missing them; claiming that the file had been tampered with. I removed the right two ".\"s and it uploaded the 3 buffered gens.
    It's mentioned here because it's a strange result.
    foldtrajlite was the only thing running on the system - (other than NAV, critical update, etc)

    (this is running the latest beta client).
    www.thegenomecollective.com
    Borging.. it's not just an addiction. It's...

  10. #50
    If you would like another picture, this was my last set (I deleted filelist.txt after I took this screenshot):



    This is what I call a flat line
    Best Energy was 24.990.

  11. #51
    This all sounds perfectly normal, thanks for the feedback. Like I said, the 24.990 or whatever other plateaus you see are from truncation of the energy function. It means your structure has a lower energy than the 'minimum' estimated from a formula we derived.
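
To illustrate why truncation shows up as a flat line, here is a minimal sketch in Python. It is not the foldtrajlite scoring code: the function name `truncate_energy`, the floor value and the sample raw energies are all made up for illustration. The only things taken from this thread are that the reported energy is kept between 0 and 100, and that structures scoring below an estimated minimum all end up showing the same value.

```python
def truncate_energy(raw, floor, ceiling=100.0):
    """Clamp a raw pseudo-energy into the range [floor, ceiling]."""
    return max(floor, min(ceiling, raw))


# Hypothetical floor: stands in for the 'minimum' estimated from the formula.
ESTIMATED_MIN = 24.990

# Hypothetical raw energies that keep improving from one generation to the next.
raw_energies = [31.2, 27.5, 24.1, 19.8, 12.3]

reported = [truncate_energy(e, ESTIMATED_MIN) for e in raw_energies]
for raw, shown in zip(raw_energies, reported):
    print(f"raw {raw:7.3f} -> reported {shown:7.3f}")
```

Once the raw value drops under the floor, every further improvement is reported as the same number, which is what a plateau in the graph would look like under these assumptions.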

    I shall be releasing one more minor variant (changing the scoring function slightly only) later this week. I'm glad the client seems to be working with no problems (other than some of those old ones we still haven't figured out yet).
    Howard Feldman

  12. #52
    Originally posted by Brian the Fist
    This all sounds perfectly normal, thanks for the feedback. Like I said, the 24.990 or whatever other plateaus you see are from truncation of the energy function. It means your structure has a lower energy than the 'minimum' estimated from a formula we derived.
    Good to know!

  13. #53

    Grrr..

    I have BETA clients stopping all over the place this morning:

    ========================[ Dec 10, 2003 6:54 AM ]========================
    Starting foldtrajlite built 2003.12.04
    Wed Dec 10 06:55:24 2003 ERROR: [000.000] {ncbi_socket.c, line 1601} SOCK#1000[4]: [SOCK::s_Connect] Failed pending connect() to 38.112.109.80:80 (Timeout) {errno=Operation now in progress}
    Wed Dec 10 06:55:24 2003 ERROR: [000.000] {ncbi_connutil.c, line 877} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Timeout

    And the process quits.

    Edit: more specifically, it only quits if it has been recently started; if it was running before the timeout message, it seems to keep running in 95% of the cases. If you start it after the timeout message, it runs for a few minutes, gives another timeout message and quits. Eventually, if you persist in restarting it, it will stay running.
    Last edited by erk; 12-09-2003 at 04:27 PM.

  14. #54
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    That's normal behavior for the client - if it can't find the server, it refuses to start. The only way to get it to run when you're having connection problems is no-net. Since you're having problems with systems refusing to see the internet and the client crashing - scan your systems for worms like Blaster/Welchia etc - as they've been known to cause the types of problems you're describing. (A local "bad" router turned out to be a router overwhelmed with probes from all the Welchia infected systems. Gee.. I guess they saved a lot on anti virus software by not bothering to update the AV software or windows..)

    Once the client can see the df server - it'll keep running, and complain every time it attempts to contact the df server but fails.
    www.thegenomecollective.com
    Borging.. it's not just an addiction. It's...

  15. #55
    Senior Member
    Join Date
    Apr 2002
    Location
    Oosterhout, Netherlands
    Posts
    223
    Originally posted by Brian the Fist
    I shall be releasing one more minor variant (changing the scoring function slightly only) later this week.
    What do you mean by this? What are you going to change?
    Proud member of the Dutch Power Cows

  16. #56
    One of my systems had this sequence twice in the error.log:

    ========================[ Dec 9, 2003 6:46 AM ]========================
    Starting foldtrajlite built Dec 4 2003

    ========================[ Dec 9, 2003 7:25 AM ]========================
    Starting foldtrajlite built Dec 4 2003

    ========================[ Dec 9, 2003 9:55 AM ]========================
    Starting foldtrajlite built Dec 4 2003
    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 293} ..87..

    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 295} .. CA ..

    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 296} ..1..

    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 297} .. CA ..

    Tue Dec 09 13:53:58 2003 FATAL ERROR: [004.001] {bbox.c, line 309} b-d node crash; node not inserted

    ========================[ Dec 9, 2003 2:54 PM ]========================
    Starting foldtrajlite built Dec 4 2003

    ========================[ Dec 9, 2003 4:14 PM ]========================
    Starting foldtrajlite built Dec 4 2003
    Too many computers, too little time......

  17. #57
    Originally posted by [DPC]Mobster
    What do you mean by this? What are you going to change?
    Don't panic. He's not talking about OUR score; he's talking about how the software scores the structures it makes, to pick the best one to carry on to the next generation.

  18. #58
    Senior Member
    Join Date
    Apr 2002
    Location
    Oosterhout, Netherlands
    Posts
    223
    Originally posted by DocWardo
    Don't panic. He's not talking about OUR score; he's talking about how the software scores the structures it makes, to pick the best one to carry on to the next generation.
    Who's panicking?
    Thanks for your clarification
    Proud member of the Dutch Power Cows

  19. #59
    Originally posted by Rebels Haven
    One of my systems had this sequence twice in the error.log:

    Starting foldtrajlite built Dec 4 2003
    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 293} ..87..

    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 295} .. CA ..

    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 296} ..1..

    Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 297} .. CA ..

    Tue Dec 09 13:53:58 2003 FATAL ERROR: [004.001] {bbox.c, line 309} b-d node crash;
    Well, I'm glad someone finally posted a useful bug report. Has anyone else received this or a similar error in their log? This is probably a bug; I will see if I can tell what causes it.
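
For anyone with a lot of machines to check, a short script can pull the fatal entries out of error.log. The following is only a sketch in Python, based on the line layout visible in the logs quoted in this thread (timestamp, severity, bracketed code, source file and line in braces, message). The function name `fatal_errors` and the regular expression are illustrative and are not part of the client.

```python
import re

# Layout taken from the log lines quoted in this thread, e.g.
# "Tue Dec 09 13:53:58 2003 FATAL ERROR: [004.001] {bbox.c, line 309} message"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<severity>FATAL ERROR|ERROR): "
    r"\[(?P<code>-?\d+\.\d+)\] "
    r"\{(?P<source>[^,]+), line (?P<line>\d+)\} "
    r"(?P<message>.*)$"
)


def fatal_errors(lines):
    """Return the parsed FATAL ERROR entries found in error.log lines."""
    found = []
    for text in lines:
        match = LOG_LINE.match(text.strip())
        if match and match.group("severity") == "FATAL ERROR":
            found.append(match.groupdict())
    return found


sample = [
    "Starting foldtrajlite built Dec 4 2003",
    "Tue Dec 09 13:53:58 2003 ERROR: [001.001] {bbox.c, line 297} .. CA ..",
    "Tue Dec 09 13:53:58 2003 FATAL ERROR: [004.001] {bbox.c, line 309} b-d node crash; node not inserted",
]

for entry in fatal_errors(sample):
    print(entry["timestamp"], entry["source"], entry["line"], entry["message"])
```

To scan a real log, pass the lines of the file instead of `sample`, for example `fatal_errors(open("error.log"))`. Lines that do not fit the pattern, such as the "Starting foldtrajlite" banner, are simply skipped.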
    Howard Feldman

  20. #60
    Administrator
    Join Date
    Jun 2003
    Location
    Chertsey Surrey UK
    Posts
    2,428
    Howard
    I am running about 70 instances of the test client on XP,2K,98 and linux and so far I have not seen a single problem.

    I have been through the logs and the only errors are occasional failures connecting to anteater.blueprint.org.

    Also the memory usage has remained stable at 96MB and hasn't increased

    It is difficult to find BUGS when there aren't any

  21. #61
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Well, this test client seems to like getting stuck at the end of each gen while making the best trajectory etc. It can sit there for hours, but if you stop and restart it, away it goes like a shot out of a cannon. The old client did this too, I think, but it seems a lot more frequent now.
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  22. #62
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    Tue Dec 09 17:44:56 2003 ERROR: [777.000] {ncbi_connutil.c, line 877} [URL_Connect] Socket connect to anteater.blueprint.org:80 failed: Unknown
    Tue Dec 09 17:44:56 2003 ERROR: [000.000] {foldtrajlite2.c, line 4900} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER

    ========================[ Dec 9, 2003 11:00 PM ]========================
    Starting foldtrajlite built Dec 4 2003
    Wed Dec 10 06:57:48 2003 ERROR: [777.000] {ncbi_http_connector.c, line 284} [HTTP] Error writing body at offset 8192 (Timeout)
    Wed Dec 10 06:57:48 2003 ERROR: [000.000] {foldtrajlite2.c, line 4900} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
    Wed Dec 10 07:07:45 2003 ERROR: [000.000] {foldtrajlite2.c, line 4812} Warning during upload: STATUS 910 MISSING PREVIOUS OR ILLEGAL GENERATION
    -----------------
    Joy of joys - my isp has been acting up the last few days - not helped out with city wide power outage. But it merrily uploaded fine in between Dec 9 at 17:44 and Dec 10 at 6:57 - and seems to have had a relapse of the handshaking errors that I've seen in the past.

    I also noticed that one of the win98se machines (amd xp 1700+ cpu, 256megs pc133 ram running at stock speeds) shut down during the power outage (probably crashed), and was turned back on when I got to work yesterday. I had the client running yesterday - but when I got to work today it had a message on the screen saying that foldtrajlite.exe had performed an illegal operation. No errors in the error log; and the same foldit.bat file used on the other win98se 256Meg machines at work. (.\foldtrajlite -f protein -n native -df -qt -rt) This is the first time it's popped up with an illegal operation during the months phase II has been in use. (I don't remember it having problems during Phase I, either).
    Hopefully it was a result of windows corruption after crashing - but I'll share it here, in case others also notice this happening on formerly problem-free systems.
    www.thegenomecollective.com
    Borging.. it's not just an addiction. It's...

  23. #63

    Just noticed another error

    I just spotted this on another system:


    ========================[ Dec 6, 2003 1:04 PM ]========================
    Starting foldtrajlite built Dec 4 2003

    ========================[ Dec 7, 2003 5:38 AM ]========================
    Starting foldtrajlite built Dec 4 2003
    Sun Dec 07 05:45:34 2003 FATAL ERROR: [003.001] {foldtrajlite2.c, line 5774} Unable to fetch Biostruc

    ========================[ Dec 7, 2003 5:48 AM ]========================
    Starting foldtrajlite built Dec 4 2003


    Hope these will end up helping everyone...
    Too many computers, too little time......

  24. #64
    The memory bloat isn't gone. After 8 days on a Knoppix box, top reported 206 MB for the beta foldtrajlite. Stopping and restarting the client brought it back down to 93. Not a true leak, all the memory was released on exit. Back to having cron stop/restart it daily.

  25. #65
    Originally posted by RandomCritterz
    The memory bloat isn't gone. After 8 days on a Knoppix box, top reported 206 MB for the beta foldtrajlite. Stopping and restarting the client brought it back down to 93. Not a true leak, all the memory was released on exit. Back to having cron stop/restart it daily.
    Are you sure it's the beta client? I have 8 linux boxen running Redhat 9 and CRUX 1.2 and none of them show memory leaks like that anymore.

  26. #66
    Senior Member
    Join Date
    Jul 2003
    Location
    Hamburg/Germany
    Posts
    386
    Today I spotted the following in my errorlog:

    ========================[ Dec 13, 2003 3:28 PM ]========================
    Starting foldtrajlite built Dec 4 2003
    Sat Dec 13 15:28:05 2003 ERROR: [000.000] {ncbi_socket.c, line 1517} SOCK#1000[?]: [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
    Sat Dec 13 15:28:05 2003 ERROR: [000.000] {ncbi_connutil.c, line 877} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
    Sat Dec 13 15:28:08 2003 ERROR: [000.000] {ncbi_socket.c, line 1517} SOCK#2000[?]: [SOCK::s_Connect] Failed SOCK_gethostbyname(ftp.mshri.on.ca)
    Sat Dec 13 15:28:08 2003 ERROR: [000.000] {ncbi_connutil.c, line 877} [URL_Connect] Socket connect to ftp.mshri.on.ca:80 failed: Unknown
    Sat Dec 13 15:28:08 2003 ERROR: [000.000] {foldtrajlite2.c, line 2179} Unable to check server status
    Sat Dec 13 15:28:08 2003 ERROR: [000.000] {ncbi_socket.c, line 1517} SOCK#3000[?]: [SOCK::s_Connect] Failed SOCK_gethostbyname(anteater.blueprint.org)
    Sat Dec 13 15:28:08 2003 ERROR: [000.000] {ncbi_connutil.c, line 877} [URL_Connect] Socket connect to anteater.blueprint.org:80 failed: Unknown
    Sat Dec 13 15:40:08 2003 FATAL ERROR: [002.000] {foldtrajlite2.c, line 1425} Cannot rename filelist.txt.tmp to filelist.txt - disk may be out of space

    ========================[ Dec 13, 2003 3:40 PM ]========================
    Starting foldtrajlite built Dec 4 2003

    ========================[ Dec 13, 2003 5:09 PM ]========================
    Starting foldtrajlite built Dec 4 2003

    ========================[ Dec 13, 2003 5:09 PM ]========================
    Starting foldtrajlite built Dec 4 2003


    It is an AthlonXP 1800 running Win2K SP4 with 512MB RAM, and the HD on which DF is running has about 13GB free.
    I'm also on a DSL connection which showed no problems, so the connection problems might be on the other end.
    I looked in the log because the beta client seemed to be dead slow. I saved the whole directory and I can upload it if you need it...


    Greets Thor

  27. #67
    Senior Member
    Join Date
    Jul 2003
    Location
    Hamburg/Germany
    Posts
    386
    Same thing just happened again with a fresh install!

    ========================[ Dec 13, 2003 7:48 PM ]========================
    Starting foldtrajlite built Dec 4 2003
    Sat Dec 13 21:54:40 2003 FATAL ERROR: [013.000] {foldtrajlite2.c, line 1425} Cannot rename .\filelist.txt.tmp to .\filelist.txt - disk may be out of space

    don't know why...it worked in the months before...

    Greets Thor

  28. #68
    run scandisk or norton disk doctor to make sure you don't have some sort of disk error?

    you don't have quotas running on that system do you?

  29. #69
    Originally posted by Thor
    Same thing just happened again with a fresh install!

    ========================[ Dec 13, 2003 7:48 PM ]========================
    Starting foldtrajlite built Dec 4 2003
    Sat Dec 13 21:54:40 2003 FATAL ERROR: [013.000] {foldtrajlite2.c, line 1425} Cannot rename .\filelist.txt.tmp to .\filelist.txt - disk may be out of space

    don't know why...it worked in the months before...

    Greets Thor
    For you C programmer types, the first number in the error message (013 in this case) is simply the value of errno after calling rename() to rename the indicated files. So 13 is EACCES indicating a sharing violation or permission denied (something else writing/reading one or both of these files?)

    In the previous case it's a 2, ENOENT, meaning "a component of the 'from' path does not exist, or a path prefix of 'to' does not exist". It is interesting that in that case there is no '.\' in front of the file name. Are the named files present in this case?

    Anyhow, it sounds like another program you are running is interfering. Try disabling all GUI programs like DFGUI, foldmonitor, etc., as well as virus scanners and firewalls, and see if the problem still occurs. Add them back one by one to identify the culprit.
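
If you want to check those errno numbers yourself without a C compiler, Python exposes the same table. This is a small sketch and not part of the client. The helper name `describe_errno` is made up, and the 13 = EACCES and 2 = ENOENT mapping is the usual one on Windows and Linux but is platform-defined. The second half reproduces the ENOENT case by renaming a file that does not exist inside a throwaway temp folder.

```python
import errno
import os
import tempfile


def describe_errno(number):
    """Return the symbolic name and the OS message text for an errno value."""
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# The two values seen in the rename() errors quoted above.
for code in (13, 2):
    name, text = describe_errno(code)
    print(f"{code:03d} -> {name}: {text}")

# Reproduce the ENOENT case: the source file of the rename is missing.
with tempfile.TemporaryDirectory() as folder:
    missing = os.path.join(folder, "filelist.txt.tmp")
    target = os.path.join(folder, "filelist.txt")
    try:
        os.rename(missing, target)
    except OSError as err:
        rename_errno = err.errno
        print("rename failed with errno", rename_errno,
              errno.errorcode.get(rename_errno))
```

The EACCES case is harder to reproduce on demand, since it depends on another process holding the file or on permissions, which fits the suggestion to rule out other programs one at a time.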

    Howard Feldman

  30. #70
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    Originally posted by Brian the Fist
    For you C programmer types, the first number in the error message (013 in this case) is simply the value of errno after calling rename() to rename the indicated files. So 13 is EACCES indicating a sharing violation or permission denied (something else writing/reading one or both of these files?)

    In the previous case it's a 2, ENOENT, meaning "a component of the 'from' path does not exist, or a path prefix of 'to' does not exist". It is interesting that in that case there is no '.\' in front of the file name. Are the named files present in this case?

    Anyhow, it sounds like another program you are running is interfering. Try disabling all GUI programs like DFGUI, foldmonitor, etc., as well as virus scanners and firewalls, and see if the problem still occurs. Add them back one by one to identify the culprit.

    Interestingly it sounds similar to the problem I posted in http://www.free-dc.org/forum/showthr...&threadid=4880

    Personally, I haven't had that problem with either test version (yet) and am still using both dfGUI and dfMon - not that that is 100% proof they aren't a/the cause of any problems

    Hopefully as other people experience this (not a good thing on their part, but for the project and 3rd party utils it is) it can be narrowed down and squished

  31. #71
    Senior Member
    Join Date
    Jul 2003
    Location
    Hamburg/Germany
    Posts
    386
    Well, all the "surrounding" programs have been the same...
    I run the testclient #1 with dfgui 3.2. This combination finished a whole set of 250gen without any problems.
    The problems I posted about happened in the second set around gen 25.
    Again: None of the other programs running in the background changed!
    HD should be fine but I will check tomorrow in the evening.
    it is after midnight here in Germany....time to get some sleep.
    I still have both of the directories saved and can upload them if either Howard or Elena want to take a look....

    Good Night

    Thor

  32. #72
    Senior Member
    Join Date
    Oct 2003
    Location
    an Island off the coast of somewhere
    Posts
    540
    I just found 4 of my 15 test version#1 clients stopped at the end of a generation at structure 100. The generations varied from 29 to 242 - nothing common there.
    There were no errors in the error log. They had simply stopped sometime between Friday afternoon and Monday morning.

    Clicking 'STOP' and 'START' on dfGui let all four instances continue, again with no errors logged.

    These are running via dfGui, on W2K Adv. Server P4 platforms. Plenty of memory and disk space available.

    On the memory usage issue -

    I started all fifteen clients on the test#1 version at the same time. They have been running roughly 120 hours and the memory seems to be stable at just under 100MB, for the ones that did not stop over the weekend.
    Last edited by willy1; 12-15-2003 at 04:16 PM.

  33. #73
    It would be helpful if you could provide as much info as possible about the 'stopped' clients so we know what is happening. Useful things would include a screenshot, a complete directory listing with file dates and timestamps and maybe a screenshot of the Task Manager display for the process as well...
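
If it saves anyone some typing, the directory listing with sizes and timestamps can be produced with a few lines of Python. This is just one way to do it, sketched under the assumption that you run it from the folder the stopped client lives in. The function name `list_directory` is made up, and a plain `dir` or `ls -l` gives the same information.

```python
import os
import time


def list_directory(path):
    """Return (name, size in bytes, last-modified time) per entry, sorted by name."""
    rows = []
    with os.scandir(path) as entries:
        for entry in entries:
            # Do not follow symlinks, so a dangling link cannot raise here.
            info = entry.stat(follow_symlinks=False)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S",
                                  time.localtime(info.st_mtime))
            rows.append((entry.name, info.st_size, stamp))
    return sorted(rows)


if __name__ == "__main__":
    for name, size, stamp in list_directory("."):
        print(f"{stamp}  {size:>12}  {name}")
```

Redirect the output to a text file and attach it along with the screenshots.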
    Howard Feldman
