Beta client bugs



Stardragon
02-11-2004, 07:23 PM
Please post any beta-client related errors/issues in this thread.

ToeKnee
02-12-2004, 12:16 AM
I have the beta version "foldtrajlite.exe" running. Running on 3.0GHz with Hyperthread and 512MB. Windows XP shows no error messages in the command window but I have
a receipt.txt file which shows:
"anteaterbeta.blueprint.org_1076559753_518"
and the error file shows:
"
========================[ Feb 11, 2004 8:18 PM ]========================
Starting foldtrajlite built Feb 11 2004

========================[ Feb 11, 2004 8:21 PM ]========================
Starting foldtrajlite built Feb 11 2004
Wed Feb 11 23:06:00 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:08:28 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:10:45 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:13:21 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:15:40 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:19:22 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:21:38 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:29:10 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN"

Progress.txt:
"Building structure 1 generation 13
9 until next generation
5 generations buffered
Best Energy so far: 10000000.000"

I started with a clean download. The protein size is 129.

What is happening? Let me know if you need more info. I have left it running.

Tony

tpdooley
02-12-2004, 07:17 AM
If you're getting that kind of error message with a clean download of the beta client, and you're putting it into a brand new directory ("c:\betafolding\distribfold", for example), then either the client or the server has the wrong protein.

Anyone else running the beta that has a clean error.log file to confirm whether it works elsewhere?

Digital Parasite
02-12-2004, 07:50 AM
It seems to be running fine for me:

========================[ Feb 12, 2004 7:45 AM ]========================
Starting foldtrajlite built Feb 11 2004

I downloaded the latest client from the DF web site and installed that in a new directory, copied over my handle.txt file, then copied the new beta .exe file and started folding.


Elena: Since you want us to test this without running dfGUI, can you tell me if there have been any changes to the output files in the beta so I know if I have to adjust dfGUI at all?

Thanks,
Jeff.

ToeKnee
02-12-2004, 10:02 AM
If you check my "progress.txt" file you will note that at least 3 generations uploaded OK (generation 13 and only 9 buffered).
This is a hyperthreading box. I am also running a non-beta client (as a service) at the same time. Task manager shows 50% CPU time for each "foldtrajlite.exe".
The oldest log.bz2 file in the directory is .....protein_8.log.bz2
The oldest min.val.bz2 file is ........._7_0000003_min.val.bz2 (there is also an ........_8_0000006_min.val.bz2)
The oldest val file is ........_8_0000006.val
I am checking all this after 51 generations
Hope this helps.

Can we see our beta upload results anywhere? Did anything get uploaded?

Tony

CSMan
02-12-2004, 10:41 AM
I left DF running overnight. Checked it this morning and my disk had run out of space so obviously DF couldn't write necessary files.

I tried to do the upload only switch without making any room on the drive, and it quit and error log said it couldn't write to filelist.txt.tmp.

I then made some room and then tried to upload my buffered generations. It just says "checking for newer version" then "verifying results of previous upload" and then quits. If I just start DF running normally and it completes a generation, it just keeps on going and doesn't seem to upload anything.
There's nothing in the error.log file except for the simple cannot write error.

Stardragon
02-12-2004, 11:38 AM
Originally posted by ToeKnee

<snip>
the error file shows:
"
========================[ Feb 11, 2004 8:18 PM ]========================
Starting foldtrajlite built Feb 11 2004

========================[ Feb 11, 2004 8:21 PM ]========================
Starting foldtrajlite built Feb 11 2004
Wed Feb 11 23:06:00 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:08:28 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
Wed Feb 11 23:10:45 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 908 INCORRECT PROTEIN
<snip>


The server was indeed reading the wrong protein version; it is now fixed.

CSMan
02-12-2004, 01:28 PM
Ok, I couldn't upload then. But after waiting for a few hours, it is uploading now. Don't mind me! :trash:

Stardragon
02-12-2004, 01:34 PM
If any of you are getting 910 errors, please see this post: http://www.free-dc.org/forum/showthread.php?s=&threadid=5571

I apologize for any lost time or inconvenience this has caused.

Stardragon
02-12-2004, 01:35 PM
<snip>
Can we see our beta upload results anywhere? Did anything get uploaded?

Tony

Yes, the beta results are available on http://anteaterbeta.blueprint.org

Digital Parasite
02-12-2004, 02:08 PM
:( :( Stardragon didn't answer my question above....

Elena: Since you want us to test this without running dfGUI, can you tell me if there have been any changes to the output files in the beta so I know if I have to adjust dfGUI at all?

Jeff.

Stardragon
02-12-2004, 02:49 PM
Originally posted by Digital Parasite
:( :( Stardragon didn't answer my question above....

Elena: Since you want us to test this without running dfGUI, can you tell me if there have been any changes to the output files in the beta so I know if I have to adjust dfGUI at all?

Jeff.

Sorry, Jeff, frantically testing more server functionality on this side :D
Send me a reminder to trades@blueprint.org and I will e-mail you a detailed outline of the changes.

dano
02-12-2004, 04:08 PM
I installed the beta on a laptop last night. It ran the first 10,000 and then 2 generations, then crashed.

It was running with -rt -it switches and the network cable unplugged.

Error log:
========================[ Feb 12, 2004 4:58 AM ]========================
Starting foldtrajlite built Feb 11 2004
Thu Feb 12 04:58:01 2004 ERROR: [000.000] {ncbi_socket.c, line 1173} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
Thu Feb 12 04:58:01 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} Socket connect to www.distributedfolding.org:80 failed: Unknown
Thu Feb 12 04:58:01 2004 ERROR: [000.000] {ncbi_socket.c, line 1173} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
Thu Feb 12 04:58:01 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} Socket connect to www.distributedfolding.org:80 failed: Unknown
Thu Feb 12 04:58:01 2004 ERROR: [000.000] {ncbi_socket.c, line 1173} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
Thu Feb 12 04:58:01 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} Socket connect to www.distributedfolding.org:80 failed: Unknown.......

-Windoze XP home
-averatec 3150h
-xp1600+ cpu
-256MB ram

dano
02-12-2004, 04:48 PM
I had another client crash. This one had just started trajectory distribution and froze. There was no foldtrajlite.lock or progress.txt in the directory. It had buffered 40 gens and had this error:

========================[ Feb 12, 2004 9:42 AM ]========================
Starting foldtrajlite built 2004.02.11
Thu Feb 12 17:03:34 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: øÍ.AUS 901 RECEIVE_ERROR

I hit Ctrl-c and restarted it, running ok now.

Mandrake 9.2
amd64 3200
512Mb ram
-qf -it -rt

Mikus
02-12-2004, 04:54 PM
I use dial-up, so I upload a bunch of filesets at one time. With the beta, I did *not* get any downloads of page 'cgi-bin/foldtraj' interlaced with the uploads of the individual filesets, as I have been experiencing with the normal project.

mikus (running Linux)

pio
02-12-2004, 05:11 PM
I have had similar errors here too:

Starting foldtrajlite built Feb 11 2004
Thu Feb 12 16:46:24 2004 ERROR: [002.000] {foldtrajlite2.c, line 4480} Warning during upload: STATUS 908 INCORRECT PROTEIN
Thu Feb 12 16:46:28 2004 ERROR: [002.000] {foldtrajlite2.c, line 4817} Warning during upload: STATUS 910 MISSING PREVIOUS OR ILLEGAL GENERATION

and I also just received this error:

Thu Feb 12 16:50:02 2004 ERROR: [000.000] {foldtrajlite2.c, line 4680} File .\52oqrd8a_0_52oqrd8a_protein_36_0000002_min.val is corrupt, missing or has been tampered with; cannot continue - replace file and start again, or manually delete filelist.txt
Thu Feb 12 16:50:02 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: Data file checksum failed

Hope this helps.

Update: Another error has appeared in my error.log

Thu Feb 12 18:15:50 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: STATUS 906 GENERIC STRUCTURE ERROR

I'm starting fresh right now, hopefully I won't encounter these errors :swear:
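
For anyone sifting through logs like the ones above, here is a minimal sketch of pulling the fields out of an error.log line and tallying the STATUS codes. The field layout is inferred only from the lines quoted in this thread, and the names LOG_LINE, parse_line and tally_status are made up for illustration; none of this is part of the DF client.

```python
import re
from collections import Counter

# Layout inferred from the lines quoted in this thread (an assumption):
# <timestamp> <LEVEL>: [<code>] {<source file>, line <n>} <message>
LOG_LINE = re.compile(
    r"^(?P<when>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) "
    r"(?P<level>[A-Z ]+): \[(?P<code>[\d.]+)\] "
    r"\{(?P<src>[^,]+), line (?P<line>\d+)\} (?P<msg>.*)$"
)
STATUS = re.compile(r"STATUS (\d+)")


def parse_line(text):
    """Return the fields of one log line as a dict, or None for banner lines."""
    match = LOG_LINE.match(text.strip())
    return match.groupdict() if match else None


def tally_status(lines):
    """Count how often each 'STATUS nnn' code appears in the messages."""
    counts = Counter()
    for text in lines:
        record = parse_line(text)
        if record is None:
            continue  # banner, blank line, or anything else that does not fit
        hit = STATUS.search(record["msg"])
        if hit:
            counts[hit.group(1)] += 1
    return counts
```

Feeding it the lines of an error.log gives a quick count per status code, which makes it easier to say "mostly 908, one 910" instead of pasting the whole file.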

dano
02-12-2004, 06:54 PM
The DF client will crash with the -it switch if the machine is not connected to the internet. This happens in Windoze and Linux.
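
Until that is fixed, a wrapper script could check that the server name resolves before starting the client. This is only a sketch of a possible workaround: can_resolve is a made-up helper, and the host name is simply the one that appears in the error logs in this thread.

```python
import socket


def can_resolve(host, resolver=socket.gethostbyname):
    """Return True if the host name resolves, False if the lookup fails.

    The resolver argument exists only so the check can be exercised
    without a network connection.
    """
    try:
        resolver(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


if __name__ == "__main__":
    if can_resolve("www.distributedfolding.org"):
        print("name resolves, network looks usable")
    else:
        print("lookup failed, machine is probably offline")
```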

DocWardo
02-12-2004, 09:49 PM
Anyone else getting a large number of buffered gens?

I've got a 24/7 connection and I got home and I have 29 gens buffered of the 188 ran since this morning.

just wondering why we are automatically buffering more gens.

ps, no errors or info in the error.log

dano
02-12-2004, 10:25 PM
Originally posted by DocWardo
Anyone else getting a large number of buffered gens?

I've got a 24/7 connection and I got home and I have 29 gens buffered of the 188 ran since this morning.

just wondering why we are automatically buffering more gens.

ps, no errors or info in the error.log

Same here

Hagar
02-12-2004, 10:31 PM
same here, even forcing the client to upload with -ut doesn't help
All it does is "Verifying status of previous upload..." and exit.

without -ut it does the same but keeps folding after verifying.
I'm running linux

CSMan
02-12-2004, 11:10 PM
I'm getting a large amount of buffered generations on a 24/7 connection too.


On another note anyone know why DF is trying to find C:\Windows\ncbi.ini? It's not there and I checked my DF folder and it isn't there either.

I was using Filemon from sysinternals and noticed foldtrajlite.exe trying to open it.

PCZ
02-13-2004, 03:44 AM
CSMan

I believe the DF clients are written using the NCBI toolkit.
DF identified itself as NCBI toolkit when getting updates from my proxy.

PCZ
02-13-2004, 08:13 AM
I have a lot of buffered gens building up.

However the client log is empty, only thing in there is a message saying the client has started.

Is there a way of logging failures to send WU's in ?
Is there a switch we should be using to make the logging verbose ?


This is the text from the receipt.txt file created in my beta client directory.

anteaterbeta.blueprint.org_1076632202_353

What do the numbers mean ?
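
Nothing in this thread answers that. One guess, and only a guess: the middle field looks like a Unix timestamp, because read that way it falls on 13 Feb 2004, the date of this post. The sketch below splits a receipt on that assumption and leaves the last field uninterpreted; parse_receipt is a made-up name.

```python
from datetime import datetime, timezone


def parse_receipt(ticket):
    """Split '<host>_<number>_<number>' into its three parts.

    Treating the middle number as seconds since 1970 (UTC) is an
    assumption, not something the DF team has confirmed.
    """
    host, stamp, tail = ticket.rsplit("_", 2)
    when = datetime.fromtimestamp(int(stamp), tz=timezone.utc)
    return host, when, tail


host, when, tail = parse_receipt("anteaterbeta.blueprint.org_1076632202_353")
print(host, when.isoformat(), tail)
```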

Stardragon
02-13-2004, 11:00 AM
For all of you who are buffering generations locally, please see the latest reply in this thread: http://www.free-dc.org/forum/showthread.php?s=&threadid=5572

ToeKnee
02-13-2004, 11:24 AM
I guess this is a "me too" post.
Lots of buffered generations. No errors.
Receipt file: anteaterbeta.blueprint.org_1076634692_9375
I am connected to the net Hi speed 24/7

Is anybody able to upload?

Tony :cheers:

Paratima
02-13-2004, 11:29 AM
Look up. :rolleyes:

pio
02-13-2004, 11:45 AM
Thanks for the information, Elena! I can't wait for the server to start accepting our generations... it should be quite interesting to see how it handles the non-stop uploading of 100s of generations!

Ever since I have started fresh again with this beta client, there have been no errors in my error log! :elephant:

I do have one question though: Before I started fresh again, I uploaded something like 20 generations, but no stats have been recorded. I assume that is because of all the 910 or 908 error messages that I was receiving? :confused:

Stardragon
02-13-2004, 12:36 PM
The 908 error should not have hindered your statistics, at least not until you started getting the 910 error.
Overall it could have happened that you were not credited because of the version discrepancy, so if the stats are completely missing - my apologies.
Once I fix the server bug, you will be sure to get credited for all the buffered data you have.

deranged128[OCAU]
02-13-2004, 08:28 PM
I've finally got some work being uploaded, only 197 gens at this stage, but the time taken to 'verifying upload on server' is far too long. The PC I'm running this from is an XP2100+ @ 1980 with 512 MB DDR333 ram, WinXP.

I have a 512/128 ADSL connection and apart from another 5 PCs which are running DF exclusively there is no other network traffic. The time though to verify these uploads is really putting a dent in the production time. Total upload time for each gen is around 50 seconds so for 197 gens I'm looking at 2.75 hours to upload with nothing else happening.
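
A quick check of that figure, using only the numbers quoted above (about 50 seconds per generation, 197 generations):

```python
SECONDS_PER_GEN = 50   # approximate upload time per generation, from the post
GENERATIONS = 197      # buffered generations waiting to upload

total_seconds = SECONDS_PER_GEN * GENERATIONS
total_hours = total_seconds / 3600

print(f"{total_seconds} s = {total_hours:.2f} h")  # 9850 s = 2.74 h
```

That comes out just under the 2.75 hours quoted.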

If this new back end verifies each upload and it is taking this long with just a few beta testers, what is it going to be like when the whole system is doing the same thing?

The alternative, as I see it, is for the upload process to run concurrently with folding. ie when uploading the client continues to fold, alleviating some very substantial down time.
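
The idea can be sketched with a work queue and a background thread. This is not how the DF client works; fold and upload are placeholder callables and every name here is hypothetical. It only shows the shape of "keep folding while a second thread drains the buffered generations".

```python
import queue
import threading


def uploader(pending, upload, done):
    """Background worker: upload generations in the order they were queued."""
    while True:
        gen = pending.get()
        if gen is None:      # sentinel value: no more work is coming
            break
        upload(gen)          # a slow network transfer would happen here
        done.append(gen)


def fold_and_upload(n_generations, fold, upload):
    """Fold generations in the main thread while uploads run alongside."""
    pending = queue.Queue()
    done = []
    worker = threading.Thread(target=uploader, args=(pending, upload, done))
    worker.start()
    for gen in range(n_generations):
        fold(gen)            # folding never waits on the network
        pending.put(gen)     # hand the finished generation to the uploader
    pending.put(None)        # tell the worker to stop once the queue is empty
    worker.join()
    return done
```

With a single worker and a FIFO queue the generations still reach the server in order, which seems to matter given the 910 "missing previous" errors reported earlier in the thread.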

Cheers,

Barry

deranged128[OCAU]
02-14-2004, 02:15 AM
I've had the beta client shut down on me twice now (foldtrajlite closes down with a service error). While it was trying to cache the completed generations (there were 3 in the folder), there was no receipt.txt file.

It seems if it isn't able to get a receipt file then it can't cache. Is that correct?

The error log shows:
Sat Feb 14 17:23:09 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 17:23:39 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 17:24:09 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 17:24:40 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 17:24:40 2004 ERROR: [777.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 14 17:24:40 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 17:24:40 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server

At this point foldtrajlite crashed leaving the .lock file in place.

RandomCritterz
02-14-2004, 02:31 AM
Windows' famous alert box: "foldtrajlite has caused an error in <unknown>". No receipt.txt; error.log shows:

Sat Feb 14 01:11:05 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 01:11:05 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: x«x«nse or unable to connect to server

tak22
02-14-2004, 02:33 AM
crashed and unrecoverable with no receipt.txt file

========================[ Feb 13, 2004 7:02 PM ]========================
Starting foldtrajlite built Feb 11 2004
Fri Feb 13 21:06:52 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 21:09:23 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 21:09:23 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server

========================[ Feb 13, 2004 11:20 PM ]========================
Starting foldtrajlite built Feb 11 2004
Fri Feb 13 23:20:51 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file ® found in upload list

on an XP2500+, 512MB, WinXP

Pascal
02-14-2004, 03:08 AM
========================[ Feb 14, 2004 8:50 AM ]========================
Starting foldtrajlite built Feb 11 2004
Sat Feb 14 08:53:04 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 08:53:06 2004 ERROR: [000.000] {foldtrajlite2.c, line 4355} Failed to query status for ticket anteaterbeta.blueprint.org_1076695936_31597

using -if -qt -rt -g10 -- for offline folding
and -u -- for uploading

Uploading and then continuing to fold is not possible; the client waits for the upload status from the server.

Hardware: Athlon 1400 TB-C, 1.4 GC/s, 512 MB DDR-RAM, Win 2000. No hardware failures for months!

dano
02-14-2004, 05:01 AM
Had a crash here too:

========================[ Feb 13, 2004 9:42 PM ]========================
Starting foldtrajlite built 2004.02.11
Fri Feb 13 22:01:55 2004 ERROR: [000.000] {foldtrajlite2.c, line 4680} File ./{handle}_1_{handle}_protein_69_0000006_min.val is corrupt, missing or has been tampered with; cannot continue - replace file and start again, or manually delete filelist.txt
Fri Feb 13 22:01:55 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: Data file checksum failed
Sat Feb 14 00:09:51 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 00:12:22 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 00:12:22 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: ° (Aesponse or unable to connect to server

Sat Feb 14 00:22:30 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 00:25:00 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 00:25:00 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: _ Fesponse or unable to connect to server

Sat Feb 14 00:25:05 2004 FATAL ERROR: [065.022] {asnio.c, line 1101} ./{handle}_1_{handle}_protein_81_0000001.valOutput
Biostruc.descr.E.< >
chemical-graph [Biostruc-graph] is not an element of .E

XP 2600+
512MB ram
Mandrake 9.2
-qf -rt -it

pharm24
02-14-2004, 09:33 AM
I did an install with dfgui - I understand not to now. I have saved the files if needed (it uploaded about 120 gens and buffered 133).

I have just made a new clean dir and started foldit.bat - after entering my handle, it is just sitting there! Shall I try later, or am I missing something?

pharm24
02-14-2004, 02:03 PM
I have the beta running on a win2000 box, AMD 2500+ @2.14ghz

Fresh install

.\foldtrajlite -f protein -n native -qf -it -rt

The client stopped with this message:

[NULL_Caption] FATAL ERROR: [000.000] Illegal file ? found in upload list
Hit Return

Progress.txt:

Building structure 6 generation 3
4 until next generation
3 generations buffered
Best Energy so far: 24.796

Error.log:

========================[ Feb 14, 2004 8:28 AM ]========================
Starting foldtrajlite built Feb 11 2004

========================[ Feb 14, 2004 8:31 AM ]========================
Starting foldtrajlite built Feb 11 2004
Sat Feb 14 08:34:04 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.

========================[ Feb 14, 2004 8:38 AM ]========================
Starting foldtrajlite built Feb 11 2004
Sat Feb 14 08:41:16 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 10:03:52 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 10:04:22 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 10:04:52 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 10:05:22 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 10:05:22 2004 ERROR: [777.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 14 10:05:22 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 10:05:22 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server

Sat Feb 14 10:09:08 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 10:09:38 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 10:10:08 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 10:10:38 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 10:10:38 2004 ERROR: [777.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 14 10:10:38 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 10:10:38 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: 8ž8žnse or unable to connect to server

Sat Feb 14 10:13:56 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 10:16:26 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 10:16:26 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °žÐlqnse or unable to connect to server

Sat Feb 14 10:16:49 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file found in upload list

jkeating
02-14-2004, 03:23 PM
Beta crashed on 1800+ AXP running WinXP. When I try to restart, it says: "Uploading fileset 1/4 to server...", but it just hangs there without uploading.

Here is the error log:

========================[ Feb 13, 2004 11:45 AM ]========================
Starting foldtrajlite built Feb 11 2004
Fri Feb 13 17:35:15 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 17:58:20 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: NO_STATUS_FOUND
Fri Feb 13 20:40:04 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:04:53 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:04:53 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server

Fri Feb 13 23:09:15 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:11:46 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:11:46 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °6
Fri Feb 13 23:16:21 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:18:52 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:18:52 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °6
Fri Feb 13 23:22:43 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:25:14 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:25:14 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °6
Fri Feb 13 23:29:14 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:32:06 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:32:06 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server


========================[ Feb 14, 2004 7:44 AM ]========================
Starting foldtrajlite built Feb 11 2004
Sat Feb 14 07:47:16 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.

RandomCritterz
02-14-2004, 04:25 PM
Despite a log full of "Error during upload: x«x«nse or unable to connect to server" my beta is now slowly uploading its buffered gens. x«x«nse?

RandomCritterz
02-14-2004, 05:12 PM
Spoke too soon:

Sat Feb 14 15:05:53 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:08:30 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:11:11 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:13:46 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:16:21 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:18:56 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:21:32 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:24:08 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:26:43 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:29:19 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:31:56 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:34:31 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:37:07 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:39:42 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:42:18 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:42:20 2004 ERROR: [000.000] {foldtrajlite2.c, line 4355} Failed to query status for ticket anteaterbeta.blueprint.org_1076792636_28504
Sat Feb 14 15:42:20 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: Could not query status
Sat Feb 14 15:45:01 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.

40 minutes to time out, with no return traffic at all (I was watching).
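
The gap can be read straight off the log timestamps. A small sketch, assuming every line starts with a 24-character timestamp as in the excerpt above; stamp is a made-up helper, and the two lines are the first and last ones from that excerpt.

```python
from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Y"   # e.g. "Sat Feb 14 15:05:53 2004"


def stamp(line):
    """Parse the timestamp that occupies the first 24 characters of a line."""
    return datetime.strptime(line[:24], FMT)


first = ("Sat Feb 14 15:05:53 2004 ERROR: [010.003] {taskapi.c, line 1218} "
         "[ReadServerResponse] Timeout waiting for response, got 0 chars.")
last = ("Sat Feb 14 15:45:01 2004 ERROR: [010.003] {taskapi.c, line 1218} "
        "[ReadServerResponse] Timeout waiting for response, got 0 chars.")

elapsed = stamp(last) - stamp(first)
print(elapsed)  # 0:39:08
```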

TazAmdmb
02-14-2004, 11:28 PM
As copied from error.log
Win2KPro / AMD 1800/ 512MB
Sat Feb 14 15:31:19 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: STATUS 906 GENERIC STRUCTURE ERROR
Sat Feb 14 15:37:43 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:40:13 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 15:40:13 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: ¸ž8‘Ònse or unable to connect to server

Sat Feb 14 15:44:31 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file Ý found in upload list

Brian the Fist
02-15-2004, 01:41 AM
Wow, you guys dug up a can of worms real fast :thumbs:
Looks like we'll have our hands full for a while yet.
I have had trouble accessing the beta server from home, and I assume it is not just me, so I may need to give it a kick Monday morning. You can continue to run the beta, or wait for us to look into and fix some of the reported errors and release a second beta (it may not be for a couple of weeks, as our lead programmer is off for a bit now).

The only thing I would ask is please read this thread before posting a 'bug'. If someone else has already posted it, or something similar, please refrain from posting it again. This helps us keep track of the problems.

Thanks for your excellent help.

erk
02-15-2004, 04:22 AM
In a nutshell, from reading this thread, some others, and my own experience with 3 Linux boxen, one of the big issues seems to be that a fresh client can't start when the server is offline, but an already running client with a receipt.txt file seems to keep going fine. Perhaps a dummy receipt.txt file generated on gen 0 might be an answer?
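
Roughly what that suggestion would look like, as a sketch only. Whether the client would accept a placeholder is unknown; ensure_receipt and the PLACEHOLDER text are invented for illustration and are not real DF values.

```python
from pathlib import Path

PLACEHOLDER = "no-ticket-yet"   # hypothetical marker, not a real DF ticket


def ensure_receipt(workdir):
    """Create a placeholder receipt.txt if none exists.

    Returns True if a file was created, False if one was already there.
    An existing receipt is never overwritten.
    """
    receipt = Path(workdir) / "receipt.txt"
    if receipt.exists():
        return False
    receipt.write_text(PLACEHOLDER + "\n")
    return True
```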

Galuvian
02-15-2004, 04:39 AM
<-- Hasn't been able to access anteaterbeta all day...

Galuvian
02-16-2004, 12:04 PM
Similar error to what TazAmdmb posted above, but the strange 'garbage' characters are different.

Sat Feb 14 01:08:09 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 01:08:39 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 01:09:09 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 01:09:39 2004 ERROR: [777.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sat Feb 14 01:09:39 2004 ERROR: [777.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 14 01:09:39 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 01:09:39 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °«°«nse or unable to connect to server

Sat Feb 14 01:10:25 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file $¼ found in upload list


Checking filelist.txt, there was this:
.\fold_0_*handle*_5_*handle*_protein_92.log.bz2
.\*handle*_0_*handle*_protein_92_0000006.val
.\fold_0_*handle*_5_*handle*_protein_93.log.bz2
.\*handle*_0_*handle*_protein_93_0000010.val
.\fold_0_*handle*_0_*handle*_protein_94.log.bz2
.\*handle*_0_*handle*_protein_94_0000001.val
.\fold_0_*handle*_5_*handle*_protein_95.log.bz2
$¼
.\fold_0_*handle*_0_*handle*_protein_96.log.bz2
.\*handle*_0_*handle*_protein_96_0000003.val
.\fold_0_*handle*_0_*handle*_protein_97.log.bz2
.\*handle*_0_*handle*_protein_97_0000001.val
CurrentStruc 0 6 134 97 1 1 30.416 -2577.607 417.272 -569.484 11154609.000 2.550 4.900 28951.203 ----HHHHHHHHHHH---------HHHHHHHHHHHH-------------------------------------------HHHH---------HHHHHHHH------HHHH-------------------
16556d34825d2f8a0479c9af08b40bef

Edit: The $¼ also has some additional unprintable characters in it.
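
A quick way to spot an entry like that before the client trips over it, as a sketch. It assumes, based only on the listings in this thread, that valid entries start with ".\" or "./" and that the file ends with a "CurrentStruc" line and a 32-character hex checksum. suspicious_entries is a made-up helper, not part of the client.

```python
import re

CHECKSUM = re.compile(r"^[0-9a-f]{32}$")   # trailer line seen in the listing


def suspicious_entries(lines):
    """Return (line number, text) for entries that match no expected shape."""
    bad = []
    for number, text in enumerate(lines, start=1):
        entry = text.rstrip("\r\n")
        looks_like_path = entry.startswith((".\\", "./"))
        known_trailer = (entry.startswith("CurrentStruc ")
                         or CHECKSUM.match(entry) is not None)
        if not (looks_like_path or known_trailer):
            bad.append((number, entry))
    return bad
```

Run over the lines of filelist.txt, it would flag the garbage entry and its position, so the list could be repaired by hand instead of deleted.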

Pascal
02-21-2004, 08:17 AM
Sat Feb 21 14:11:21 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to www.distributedfolding.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:11:21 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
Sat Feb 21 14:11:42 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to www.distributedfolding.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:11:42 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
Sat Feb 21 14:12:03 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to www.distributedfolding.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:12:03 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
Sat Feb 21 14:12:03 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 21 14:12:25 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to ftp.mshri.on.ca:80 (Unknown) {errno=Invalid argument}
Sat Feb 21 14:12:25 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to ftp.mshri.on.ca:80 failed: Unknown
Sat Feb 21 14:12:46 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to ftp.mshri.on.ca:80 (Unknown) {errno=Invalid argument}
Sat Feb 21 14:12:46 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to ftp.mshri.on.ca:80 failed: Unknown
Sat Feb 21 14:13:07 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to ftp.mshri.on.ca:80 (Unknown) {errno=Invalid argument}
Sat Feb 21 14:13:07 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to ftp.mshri.on.ca:80 failed: Unknown
Sat Feb 21 14:13:07 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 21 14:13:07 2004 ERROR: [000.000] {foldtrajlite2.c, line 2197} Unable to check server status
Sat Feb 21 14:13:29 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:13:29 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sat Feb 21 14:13:50 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:13:50 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sat Feb 21 14:14:11 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:14:11 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sat Feb 21 14:14:11 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 21 14:14:32 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:14:32 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sat Feb 21 14:14:53 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:14:53 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sat Feb 21 14:15:14 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Unknown) {errno=No such file or directory}
Sat Feb 21 14:15:14 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sat Feb 21 14:15:14 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Feb 21 14:15:14 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server


Client crashed with a fatal exception.

Mikus
02-21-2004, 09:32 AM
Originally posted by erk
In a nutshell, from reading this thread, some others, and my own experience with 3 Linux boxen, one of the big issues seems to be that a fresh client can't start when the server is offline, but an already running client with a receipt.txt file seems to keep going fine. Perhaps a dummy receipt.txt file generated on gen 0 might be an answer?

I run dial-up. If I want to start the client when not connected (or if the server is offline!), I just use '-i f' (without quotes) as a parameter. Works for me.

mikus

Galuvian
02-23-2004, 09:56 AM
Beta client has locked up twice on me now. No error messages being generated.

Happened while calculating trajectory distribution this time, not sure about the first time.

Edit:
Running win2k sp2
Dual P4 1.7GHz Xeons
2GB RAM
.\foldtrajlite -f protein -n native -qf -it -rt
No 3rd party apps related to DF running

DF crashed over the weekend, the only other thing running on that box was GIMPS dedicated to the second processor.

Brian the Fist
02-23-2004, 10:55 AM
At the risk of repeating myself,

When reporting bugs with the beta client, please provide as much information as possible - your exact OS, amount of RAM, how to reproduce the problem you are getting, if known, any messages in the error log (and it is not necessary to post the WHOLE log if it's the same error over and over...), the flags you were using in the foldit.bat (if other than the defaults), and whether you were using dfGUI or something similar, or running from the command line.

Without all this information, we have little or no chance of fixing any bugs you may report. Thanks!

Chaser
02-24-2004, 10:28 AM
no real bug. however I have this in the error.log:
Mon Feb 23 19:09:20 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: °7 (<- in the log there is a rectangle between the ° and 7)

preceded and followed by:
Mon Feb 23 19:09:20 2004 ERROR: [000.000] {ncbi_socket.c, line 1258} [SOCK::s_Connect] Failed pending connect to anteaterbeta.blueprint.org:80 (Timeout) {errno=No such file or directory}
Mon Feb 23 19:09:20 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Timeout
Mon Feb 23 19:09:20 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up

and earlier i have a message:
Mon Feb 23 18:54:58 2004 ERROR: [000.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192

i have isdn... so possibly my isdn hung up..



i also had things like this:
Mon Feb 23 12:11:16 2004 ERROR: [000.000] {foldtrajlite2.c, line 4680} File handle_1_handle_protein_8_0000007_min.val is corrupt, missing or has been tampered with; cannot continue - replace file and start again, or manually delete filelist.txt
Mon Feb 23 12:11:16 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: Data file checksum failed

however i didn't have to start at zero. it continued after restarting the client



and finally once or twice I got:
Mon Feb 23 11:31:46 2004 FATAL ERROR: [003.001] {foldtrajlite2.c, line 5816} Unable to fetch Biostruc

had to restart the client. then it folded on....





using win xp with sp1 and most updates
using dfgui 3.3 beta
athlon xp 2000+
having 512mb ram ddr 333
enough free diskspace (~400mb)
df running about 3 hours per day
switches: .\foldtrajlite -f protein -n native -qt -rt -i f -g 1

hope this report contains as much information as needed..

bwkaz
02-24-2004, 06:28 PM
Originally posted by Chaser
and finally once or twice I got:
Mon Feb 23 11:31:46 2004 FATAL ERROR: [003.001] {foldtrajlite2.c, line 5816} Unable to fetch Biostruc

This happens (on the non-beta client at least...) when you delete the files that DF is using inside the temp directory.

Those files are named file*.cdx, file*.dbf, file*.fpt, and file* (with no extension). Basically I've gotten to the point where I don't delete any of those files unless DF is *not* running.

If your system has a temp directory cleaner, that could be the culprit, too.
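If you script your own temp cleanup, one option is to skip anything that matches the file names listed above. This is a minimal sketch using only those four patterns; the function names are made up for illustration, and the case-insensitive matching is an assumption (Windows file names are not case-sensitive).

```python
from fnmatch import fnmatch

# Name patterns taken from the post above. Names are lowercased before
# matching, which is an assumption made for Windows-style file names.
DF_TEMP_PATTERNS = ("file*.cdx", "file*.dbf", "file*.fpt")


def looks_like_df_temp_file(name):
    """True if the name matches one of the DF working-file patterns."""
    lowered = name.lower()
    if any(fnmatch(lowered, pattern) for pattern in DF_TEMP_PATTERNS):
        return True
    # "file*" with no extension at all
    return lowered.startswith("file") and "." not in lowered


def safe_to_delete(names):
    """Filter a list of temp-directory names down to the ones to clean."""
    return [name for name in names if not looks_like_df_temp_file(name)]
```

This errs on the side of keeping files: any unrelated temp file that happens to be called `file...` with no extension is left alone too.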

Chaser
02-25-2004, 06:57 AM
jupp! i think you're right! i think i cleaned my temp dir ;) didn't watch for those files. at least in this version you just have to restart the client. in former times the client had to begin at zero..

@howard... why not create a temp folder in the df folder?!

Welnic
02-25-2004, 11:31 AM
I have a dual Athlon box running Mandrake 9.0. I am running one beta client normally and the other with the -if flag. On the normal client when you run with the -ut flag to upload it stays in the normal terminal mode and prints out a line for each upload that happens. With the beta client it goes to black background that you normally see when you are actually folding and then starts printing out the upload lines. These get put one under the other until it gets to the bottom of the screen. Then the next one gets added to the right of the last and then you don't see anymore. So you can only see how it is progressing for about the first 18 with my terminal size.

Brian the Fist
02-25-2004, 12:23 PM
Originally posted by Welnic
I have a dual Athlon box running Mandrake 9.0. I am running one beta client normally and the other with the -if flag. On the normal client when you run with the -ut flag to upload it stays in the normal terminal mode and prints out a line for each upload that happens. With the beta client it goes to black background that you normally see when you are actually folding and then starts printing out the upload lines. These get put one under the other until it gets to the bottom of the screen. Then the next one gets added to the right of the last and then you don't see anymore. So you can only see how it is progressing for about the first 18 with my terminal size.

Yep thats a bug - easy to fix though, thanks

Brian the Fist
02-25-2004, 12:24 PM
The most prevalent new bug then seems to be some filenames with strange characters in them (like squares and symbols). We will try to find the source of this and fix it ASAP

hallmar
02-27-2004, 03:27 AM
Been running beta client on and off for a while and got this today, the client had no other distinct errors (other than not connecting to server), it just stopped .. It was up to gen166 at the time

Fri Feb 27 10:00:19 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file T7 found in upload list

========================[ Feb 27, 2004 11:25 AM ]========================
Starting foldtrajlite built Feb 11 2004
Fri Feb 27 11:25:32 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file T7 found in upload list

Running XP Pro sp1 with no updates (oops) :)

Only flag is -rt

Xp2100+ @ 2254
256 Geil @ 204fsb
Jetway n2pa-ultra NForce2
GF2mx

The box performs faultlessly 24/7 on DF


System is headless

Cheers

Edit / found the o/s had no updates :spank:

Chaser
02-27-2004, 08:11 AM
yesterday i lost 342 generations :( filelist tampered - my pc crashed... fuc*!!!!!!!!!!!!!

Chaser
02-27-2004, 01:09 PM
well.. restarted the client.. then i tried to upload (about 30 gens). my internet broke down. so upload couldn't be finished. now i get the following message:


Fri Feb 27 18:45:03 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Timeout
Fri Feb 27 18:45:03 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up

========================[ Feb 27, 2004 6:45 PM ]========================
Starting foldtrajlite built Feb 11 2004
Fri Feb 27 18:45:12 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file mmdb2.h60 found in upload list

the last message is the important one!! hm... restart of the client doesn't solve the problem. using dfgui 3.3beta, offline, quietmode, extra ram, win xp pro sp1, 512mb ram

:/






forgot to post the filelist.txt

.\fold_0_######_4_######_protein_7.log.bz2
.\######_0_######_protein_7_0000005.val
.\fold_0_######_1_######_protein_8.log.bz2
.\######_0_######_protein_8_0000002.val
.\fold_0_######_7_######_protein_9.log.bz2
.\######_0_######_protein_9_0000008.val
.\fold_0_######_4_######_protein_10.log.bz2
.\######_0_######_protein_10_0000005.val
.\fold_0_######_7_######_protein_11.log.bz2
.\######_0_######_protein_11_0000008.val
.\fold_0_######_4_######_protein_12.log.bz2
.\######_0_######_protein_12_0000005.val
.\fold_0_######_8_######_protein_13.log.bz2
.\######_0_######_protein_13_0000009.val
.\fold_0_######_9_######_protein_14.log.bz2
.\######_0_######_protein_14_0000010.val
.\fold_0_######_6_######_protein_15.log.bz2
.\######_0_######_protein_15_0000007.val
.\fold_0_######_2_######_protein_16.log.bz2
.\######_0_######_protein_16_0000003.val
.\fold_0_######_5_######_protein_17.log.bz2
.\######_0_######_protein_17_0000006.val
.\fold_0_######_1_######_protein_18.log.bz2
.\######_0_######_protein_18_0000002.val
.\fold_0_######_1_######_protein_19.log.bz2
.\######_0_######_protein_19_0000002.val
.\fold_0_######_5_######_protein_20.log.bz2
.\######_0_######_protein_20_0000006.val
.\fold_0_######_1_######_protein_21.log.bz2
.\######_0_######_protein_21_0000002.val
mmdb2.h60
.\######_0_######_protein_22_0000003.val
.\fold_0_######_1_######_protein_23.log.bz2
.\######_0_######_protein_23_0000002.val
.\fold_0_######_8_######_protein_24.log.bz2
.\######_0_######_protein_24_0000009.val
.\fold_0_######_0_######_protein_25.log.bz2
.\######_0_######_protein_25_0000001.val
.\fold_0_######_8_######_protein_26.log.bz2
.\######_0_######_protein_26_0000009.val
.\fold_0_######_0_######_protein_27.log.bz2
.\######_0_######_protein_27_0000001.val
CurrentStruc 0 3 134 27 1 1 99.990 -244.406 961.789 71.738 984772.375 0.850 1.500 250.000 ------HHHHHHHHHHH-----------HHHHHHH----------------------------------------------------HHHHHHHHHHHHH--------------------HHHHHHH---
6f2a51f3f8c35a7e73b4bf1726a0f584

Stardragon
03-04-2004, 01:59 PM
Originally posted by Pascal
========================[ Feb 14, 2004 8:50 AM ]========================
Starting foldtrajlite built Feb 11 2004
Sat Feb 14 08:53:04 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Sat Feb 14 08:53:06 2004 ERROR: [000.000] {foldtrajlite2.c, line 4355} Failed to query status for ticket anteaterbeta.blueprint.org_1076695936_31597

using -if -qt -rt -g10 -- for offline folding
and -u -- for uploading

Upload and continuing in folding's not possible. Client waits for upload status from server.

Hardware: Athlon 1400 TB-C, 1.4 GC/s, 512 MB DDR-RAM, Win 2000. No hardware failures since months!

This should not be the case. The upload part will indeed get aborted due to inability to query the status, but the client should continue folding locally until it is ready to upload again, at which point the receipt will be checked yet again. If this is not the case for you and the client exits, I would like to take a look at your directory when this occurs, or you could provide detailed steps as to how the bug can be reproduced.

Stardragon
03-04-2004, 02:25 PM
Originally posted by deranged128[OCAU]
I've finally got some work being uploaded, only 197 gens at this stage, but the time taken to 'verifying upload on server' is far too long. The PC I'm running this from is an XP2100+ @ 1980 with 512 MB DDR333 ram, WinXP.

I have a 512/128 ADSL connection and apart from another 5 PCs which are running DF exclusively there is no other network traffic. The time though to verify these uploads is really putting a dent in the production time. Total upload time for each gen is around 50 seconds so for 197 gens I'm looking at 2.75 hours to upload with nothing else happening.

If this new back end verifies each upload and it is taking this long with just a few beta testers, what is it going to be like when the whole system is doing the same thing?

The alternative, as I see it, is for the upload process to run concurrently with folding. ie when uploading the client continues to fold, alleviating some very substantial down time.

Cheers,

Barry

The individual verification is necessary to avoid any disruptions in the consecutive upload of generations. This is in place to prevent previously occurring errors that were related to client-server timeouts without the proper communication. Keep in mind that we are only testing with one server machine, which slows down the individual upload time when many users are all uploading at the same time.

If enough people are interested, it may be possible to add a flag to suppress the verification messages, which may marginally speed up the process - for now you can try uploading in quiet mode.

Stardragon
03-04-2004, 04:04 PM
Originally posted by jkeating
Beta crashed on 1800+ AXP running WinXP. When I try to restart, it says: "Uploading fileset 1/4 to server...", but it just hangs there without uploading.

Here is the error log:

========================[ Feb 13, 2004 11:45 AM ]========================
Starting foldtrajlite built Feb 11 2004
Fri Feb 13 17:35:15 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 17:58:20 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: NO_STATUS_FOUND
Fri Feb 13 20:40:04 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:04:53 2004 ERROR: [010.003] {taskapi.c, line 1218} [ReadServerResponse] Timeout waiting for response, got 0 chars.
Fri Feb 13 23:04:53 2004 ERROR: [000.000] {foldtrajlite2.c, line 4953} Error during upload: No response or unable to connect to server
.....



You should not be getting the status_not_found_error, as this simply forces a re-upload of the most recently uploaded fileset. Thanks for pointing that out, it will be fixed shortly.

As for the timeouts, as long as the server cannot be reached, the upload won't proceed and it should fold locally. When you say "it hangs", what exactly do you mean by that? Folding locally? Quitting? Stops on a verification message and doesn't continue? And what flags were you running with when this occurred?

Ned
03-06-2004, 12:35 PM
I started with a fresh copy of everything and ran with internet
disabled. After getting 37 generations, I started my line monitor
(MyVitalAgent), then dialed up ISP, and then signaled dfGUI to
upload. The line monitor displays a count of bytes sent and received
by "transaction". I'm not sure how much control information of the
various protocols are included in these counts. With the upload of a
"generation", it reported 116 bytes sent, then 486 received, then
after delay, 1883 received. Upload of 37 generations required 29
minutes. (Monitor keeps track of time connected which is another
reason I use it.)

The good thing... Beta upload gets a lot less information back
than current upload, so line speed is not an important factor
any more.
The bad thing... Upload/ verification is a lot slower than current
upload. Current upload is three generations a minute, beta is
slightly less than one per minute.

Ned...

pointwood
03-08-2004, 03:20 AM
My beta client crashed this weekend with the following error:


Sat Mar 06 07:39:38 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1729} Illegal file RAND found in upload list

This is on a standard P4 2ghz running WinXP, hasn't been OC'ed.

I have the complete folder, so if you want it I can upload it.

Ned
03-08-2004, 09:12 AM
Group the Results of Generations at the Client
++++++++++++++++
I'm assuming that you are using a full function database to store
the folding results.
With any database with full recovery capability, a large majority
of the processing is in the system overhead for the commitment of
the updates to the DB.
When you perform multiple updates before you request DB
commitment, you save on that overhead for an individual update.
When I was involved with a benchmark/prototype for message
processing once, we found that updating of the database in
groups of 40 messages at a time was the perfect balance of
response time versus overhead.
For DF, the optimum number of updates per commitment would
depend on your server hardware and software.
In your case, where the accumulated generation results would
probably be all inserted in the same place in the DB, you
wouldn't have the problem of scattered inserts and associated
lockout problems.
As an example, I estimate that if 100 units of resource were
used to insert one record in the database, then 150 units of
resource would be used to insert ten logical sequential
records in the same database as one DB commit.
That is roughly a 6.7-fold improvement in resource utilization!
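The arithmetic behind that estimate can be checked in a few lines. The cost figures are the poster's own guesses from the paragraph above, not measurements from the DF server, and the function name is made up for illustration.

```python
def batching_gain(single_cost, batch_cost, batch_size):
    """Compare committing each record separately with one batched commit."""
    unbatched = single_cost * batch_size  # every record pays the full cost
    return unbatched, unbatched / batch_cost


# 100 units per single insert, 150 units for ten inserts in one commit
unbatched, ratio = batching_gain(100, 150, 10)
print(unbatched, round(ratio, 2))  # 1000 6.67
```

So ten separate commits would cost 1000 units against 150 for the batch, a factor of about 6.7 under these assumed numbers.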

--------------------------------
Now, at the client...
First, in fast machines, the generation processing time is small.
Even slow machines get the job done.
Second, the data uploaded per generation is small.
SOOOOO...
-------
Consider gathering 5 to 50 generations before attempting to upload
the results, even if a permanent connection exists.
Since the data uploaded per generation is small, bandwidth is not
an issue; AND the processing resources required at the server
is a major consideration.
You might want to make the number of generations user selectable
with a minimum and maximum.
In the case where internet is not available, gather the maximum
generations per upload.
You would need to retain the "upload" request capability to force
an upload at any time with the accumulated generations.
--------
During uploading of the beta, I noticed that the verification
request actually took longer than the insertion request.
That is probably because the verification request has to wait
for the commit process for the insertion request to complete,
before the verification request can start.
--------
In the beta, you are verifying the data is uploaded before
continuing.
In the case of committing the update of many results, verifying
that the last one was inserted correctly would give that
validation.
-----------
In Summary, group the results for multiple generations at the
client and then process these sequential results as one DB
commit at the server.

The advantage to the project is the gathering of data with
far less resources...
The alternative is to throw multiple copies of hardware at it...

Ned:D

Stardragon
03-08-2004, 11:39 AM
Although your idea is good in theory, it would not apply in practice. The reason for that is simple - the upload of each generation depends entirely on the successful upload and processing of the preceding generation. So if we allowed the upload of 50 generations all at once, then proceeded to validate this and discovered that the second generation is somehow corrupted, this would only lead to a waste of resources, and to 48 useless generations we would then have to discard. Not to mention the fact that the error messages provided to the user in that case would be completely out of sync.

Most people are happy with the option to fold offline, and it is pretty simple to write a script which uploads after X number of generations is buffered.
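As a rough idea of what such a script could look like: progress.txt, as quoted at the top of this thread, contains a line like "5 generations buffered", so a wrapper can poll that file and decide when to kick off an upload. The sketch below covers only the decision part. The function names and the threshold are made up, and how the upload is actually triggered (dfGUI, or one of the upload flags people mention in this thread) is deliberately left out.

```python
import re

# Matches the "N generations buffered" line seen in the progress.txt excerpt
# quoted in this thread; the exact wording is assumed from that one example.
BUFFERED = re.compile(r"^\s*(\d+) generations? buffered", re.MULTILINE)


def buffered_generations(progress_text):
    """Return the buffered-generation count, or 0 if the line is absent."""
    match = BUFFERED.search(progress_text)
    return int(match.group(1)) if match else 0


def should_upload(progress_text, threshold):
    """True once at least `threshold` generations are waiting."""
    return buffered_generations(progress_text) >= threshold
```

A wrapper would re-read progress.txt every few minutes and, once `should_upload(text, 50)` returns True, start the upload in whatever way you normally do.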

Thanks for the suggestion though, it is good to know our users are on the lookout for improving the system :).

Stardragon
03-08-2004, 01:27 PM
Originally posted by pointwood
My beta client crashed this weekend with the following error:



This is on a standard P4 2ghz running WinXP, hasn't been OC'ed.

I have the complete folder, so if you want it I can upload it.

Yes, I would like to take a look at it if you still have it - please upload to ftp.blueprint.org - use the /incoming directory at the root level, logging in as anonymous.

tpdooley
03-08-2004, 05:18 PM
Originally posted by Stardragon
Yes, I would like to take a look at it if you still have it - please upload to ftp.blueprint.org - use the /incoming directory at the root level, logging in as anonymous.


And send them an email with the file name you uploaded, and all the system specs and details of the problem again..

pointwood
03-09-2004, 02:28 AM
What email address should that be sent to?

Anyway, it's uploaded now - the filename is "pointwood.zip".

The machine is a standard Fujitsu-Siemens P4 2Ghz with 512MB mem.

Stardragon
03-09-2004, 10:25 AM
Just drop an e-mail to trades@mshri.on.ca outlining the flags you were running with and what you were trying to do when the bug occurred (e.g. buffering offline, then trying to upload the buffered results).

pointwood
03-10-2004, 03:11 AM
I tried mailing you, but it failed for some reason :(

Anyway, I run it with the mem switch.

Stardragon
03-10-2004, 10:27 AM
Please use the newly posted beta version and report any bugs in the new thread - http://www.free-dc.org/forum/showthread.php?s=&threadid=5804

Brian the Fist
03-10-2004, 12:18 PM
Ned,

rest assured we are using a 'professional' database backend, and it has been set up by people who know what they are doing, and customized for this specific job.

Chaser
03-10-2004, 01:24 PM
P.S. How about saving a filelist_?_????????_protein_???.txt with every gen? If the last filelist.txt corrupts, you can just "-purgeuploadlist 1", rename the last filelist*.txt file, and roll on, since every filelist_?_????????_protein_???.txt would be valid for its and all previous gens.

The client can repeat the above automatically until it hits a good file, or until it runs out of buffered gens. If the latter, then it just starts over automatically. Everything is, of course, recorded automatically in the error.log file. This behaviour can be set as an option; e.g. -autorecover will try to recover from crash, otherwise display error message and stop.
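To make the proposal concrete, here is a minimal sketch of the "walk back until a good file is found" step. Everything in it is hypothetical: the client does not write per-generation filelist backups today, the `_protein_<gen>.txt` naming is taken from the suggestion above, and the validity check is passed in by the caller because how the client validates filelist.txt is not documented in this thread.

```python
import re

# Generation number at the end of a hypothetical per-generation backup name,
# e.g. filelist_0_handle_protein_12.txt (naming taken from the proposal).
GEN = re.compile(r"_protein_(\d+)\.txt$")


def newest_valid_backup(names, is_valid):
    """Return the highest-generation backup that passes is_valid, else None."""
    candidates = [name for name in names if GEN.search(name)]
    candidates.sort(key=lambda name: int(GEN.search(name).group(1)), reverse=True)
    for name in candidates:
        if is_valid(name):
            return name
    return None  # out of buffered gens: the client would start over
```

The `None` case corresponds to the "runs out of buffered gens" branch of the proposal.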



from this thread:
http://www.free-dc.org/forum/showthread.php?s=&threadid=5291&perpage=25&pagenumber=1

give a comment!

Chaser
03-21-2004, 04:17 AM
========================[ Mar 20, 2004 8:50 PM ]========================
Starting foldtrajlite built Mar 9 2004
Sat Mar 20 20:50:02 2004 FATAL ERROR: [002.000] {foldtrajlite2.c, line 1448} Cannot rename filelist.txt.tmp to filelist.txt - disk may be out of space

my diskspace ran out of space :( - so i lost about 120 gens... It should be handled better?!!!!!!!!!!!!!!!!!





========================[ Mar 20, 2004 5:43 PM ]========================
Starting foldtrajlite built Mar 9 2004
Sat Mar 20 17:45:27 2004 FATAL ERROR: CoreLib [002.005] {ncbifile.c, line 715} File write error

no Idea, what this was...

running xp pro sp1, amd xp 2000+, 768mb ddr, offline, extra ram

cya

Stardragon
03-26-2004, 12:21 PM
Originally posted by Chaser
my diskspace ran out of space :( - so i lost about 120 gens... It should be handled better?!!!!!!!!!!!!!!!!!

There is not much that can be done if there is no space to write out the filelist. You should generally leave a margin of free space for file writing operations.

The other error was also caused by the lack of disk space.

Chaser
03-26-2004, 12:58 PM
========================[ Mar 26, 2004 1:45 PM ]========================
Starting foldtrajlite built Mar 9 2004
Fri Mar 26 13:45:52 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 1569} Upload list has been tampered with, please delete filelist.txt and try again

had a look into filelist.txt with ultraedit32 .. there was a "?" at the very bottom of the filelist.txt (on an extra line).. i removed it and suddenly it could upload again... i saved the original filelist.txt and also the edited one that then worked...
however the rest of the files are mostly uploaded

edit:
I looked at the file again via Notepad.. and whats there at the end of the file? Three ****i**squares!

Chaser
03-28-2004, 02:47 PM
*grr*

Previous generation missing :(


========================[ Mar 28, 2004 9:43 PM ]========================
Starting foldtrajlite built Mar 9 2004
Sun Mar 28 21:43:42 2004 ERROR: [000.000] {foldtrajlite2.c, line 4620} Cannot find structure from previous generation .\########_2_########_protein_102_0000002_min.val; find it manually or delete filelist.txt to continue
Sun Mar 28 21:43:42 2004 ERROR: [000.000] {foldtrajlite2.c, line 4963} Error during upload: Previous generation missing







.\fold_2_########_1_########_protein_102.log.bz2
.\########_2_########_protein_102_0000002.val
.\fold_2_########_1_########_protein_103.log.bz2
.\########_2_########_protein_103_0000002.val
.\fold_2_########_8_########_protein_104.log.bz2
.\########_2_########_protein_104_0000009.val
.\fold_2_########_9_########_protein_105.log.bz2
.\########_2_########_protein_105_0000010.val
.\fold_2_########_3_########_protein_106.log.bz2
.\########_2_########_protein_106_0000004.val
.\fold_2_########_3_########_protein_107.log.bz2
.\########_2_########_protein_107_0000004.val
.\fold_2_########_0_########_protein_108.log.bz2
.\########_2_########_protein_108_0000001.val
.\fold_2_########_1_########_protein_109.log.bz2
.\########_2_########_protein_109_0000002.val
.\fold_2_########_8_########_protein_110.log.bz2
.\########_2_########_protein_110_0000009.val
.\fold_2_########_0_########_protein_111.log.bz2
.\########_2_########_protein_111_0000001.val
.\fold_2_########_5_########_protein_112.log.bz2
.\########_2_########_protein_112_0000006.val
.\fold_2_########_4_########_protein_113.log.bz2
.\########_2_########_protein_113_0000005.val
.\fold_2_########_0_########_protein_114.log.bz2
.\########_2_########_protein_114_0000001.val
.\fold_2_########_8_########_protein_115.log.bz2
.\########_2_########_protein_115_0000009.val
.\fold_2_########_1_########_protein_116.log.bz2
.\########_2_########_protein_116_0000002.val
.\fold_2_########_2_########_protein_117.log.bz2
.\########_2_########_protein_117_0000003.val
.\fold_2_########_0_########_protein_118.log.bz2
.\########_2_########_protein_118_0000001.val
.\fold_2_########_1_########_protein_119.log.bz2
.\########_2_########_protein_119_0000002.val
CurrentStruc 2 6 134 119 1 2 41.269 -2420.163 -288.118 -667.286 12191623.000 2.650 5.100 38287.996 ----HHHHH-HHHH------------HHHHHHHHHH-------------------------------------------HHHH---------HHHHHHH---------HHHH---------HHHH----
ec640ed96d7f11cd6388c06ac2ad4803



normally running offline, xp pro, amd athlon xp 2000+, 768mb ddr ram, enough free space

Just found 2 foldtraj...exe clients running.. stopped one via taskmanager.... tried to upload.. bam: prev. gen. missing... shit!!!

don't know how 2 clients could start up at the same time.. running dfgui 3.3 beta...

cya

Stardragon
03-29-2004, 10:20 AM
Do you still have a copy of the directory after the error occurred? I would like to take a look at it if it's still available.

Chaser
03-29-2004, 10:59 AM
Originally posted by Stardragon
Do you still have a copy of the directory after the error occurred? I would like to take a look at it if it's still available.

Directory is still available.. Name is vakyiu6n.rar.. Shall I upload it? I deleted only the rotlin.bin.bz2, so that the file is smaller.
~13mb

Stardragon
03-30-2004, 10:22 AM
Could you please upload that directory to ftp.blueprint.org/incoming. You can log in as anonymous. Please send an e-mail to trades@mshri.on.ca when you have placed your directory there. Thank you.

Chaser
03-31-2004, 12:01 PM
Originally posted by Stardragon
Could you please upload that directory to ftp.blueprint.org/incoming. You can log in as anonymous. Please send an e-mail to trades@mshri.on.ca when you have placed your directory there. Thank you.

I uploaded it and sent a mail...


another question:
will the points computed with the beta client be transferred to the normal statistics?

Chaser
04-04-2004, 05:59 AM
again a bug!

just uploaded about 300 gens with my isdn...


========================[ Apr 4, 2004 9:52 AM ]========================
Starting foldtrajlite built Mar 9 2004
Sun Apr 04 10:02:01 2004 ERROR: [000.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192
Sun Apr 04 10:02:17 2004 ERROR: [000.000] {ncbi_http_connector.c, line 244} [HTTP] Error writing body at offset 8192

========================[ Apr 4, 2004 11:29 AM ]========================
Starting foldtrajlite built Mar 9 2004
Sun Apr 04 11:29:05 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 3345} Unable to find file ########_1_########_protein_167_0000009.val; cannot continue - replace file and start again, or manually delete filelist.txt

========================[ Apr 4, 2004 11:34 AM ]========================
Starting foldtrajlite built Mar 9 2004
Sun Apr 04 11:34:05 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 3345} Unable to find file ########_1_########_protein_167_0000009.val; cannot continue - replace file and start again, or manually delete filelist.txt

========================[ Apr 4, 2004 11:44 AM ]========================
Starting foldtrajlite built Mar 9 2004
Sun Apr 04 11:44:20 2004 FATAL ERROR: [000.000] {foldtrajlite2.c, line 3345} Unable to find file ########_1_########_protein_167_0000009.val; cannot continue - replace file and start again, or manually delete filelist.txt


That really sucks. I have RARed the directory and will upload it to the pub today. It's named 040404.rar.

cya :(
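
The fatal error above stops at the first missing file, so it may help to list every missing file before deciding whether to replace files or delete filelist.txt. A rough sketch follows; it assumes filelist.txt holds one file name per line, which the log does not confirm, and `missing_files` is a hypothetical helper, not part of the client.

```python
import os

def missing_files(work_dir, list_name="filelist.txt"):
    """Return names from the file list that are absent from work_dir.

    Assumption: the list holds one file name per line.
    """
    missing = []
    with open(os.path.join(work_dir, list_name)) as f:
        for line in f:
            name = line.strip()
            # Skip blank lines; report names with no matching file.
            if name and not os.path.exists(os.path.join(work_dir, name)):
                missing.append(name)
    return missing
```

Calling `missing_files` on the client directory would return the listed names that are not on disk, under the one-name-per-line assumption.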

Chaser
04-13-2004, 12:01 PM
Sat Apr 10 11:21:09 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sat Apr 10 11:21:28 2004 ERROR: [001.001] {trajtools.c, line 3641} Unable to open trajectory distribution file ########_protein_48.trj
Sat Apr 10 11:21:28 2004 FATAL ERROR: [002.003] {foldtrajlite2.c, line 5668} Unable to read trajectory distribution ########_protein_48, please create a new one

========================[ Apr 11, 2004 11:24 AM ]========================
Starting foldtrajlite built Mar 9 2004
Sun Apr 11 11:25:13 2004 ERROR: [001.001] {trajtools.c, line 3641} Unable to open trajectory distribution file ########_protein_49.trj
Sun Apr 11 11:25:13 2004 FATAL ERROR: [002.003] {foldtrajlite2.c, line 5668} Unable to read trajectory distribution ########_protein_49, please create a new one


Fixed it by renaming ########_protein_50.trj to ########_protein_49.trj.


Will there be another beta client? Will the points generated by the beta client be transferred to the normal stats?

Chaser
04-15-2004, 05:55 PM
On Sunday I'll stop testing the beta client and go back to the normal client. I reported three or four issues with NO feedback from your side. If there's another client, I might test it again.

Chaser
04-16-2004, 01:55 PM
Will the beta client update automatically?

tpdooley
04-16-2004, 02:27 PM
I'd switch to the normal client as you mentioned you were doing... and then it'll automatically update.

Chaser
04-18-2004, 07:16 AM
It's really annoying:


Sun Apr 18 11:28:43 2004 ERROR: [000.000] {foldtrajlite2.c, line 4353} Failed to query status for ticket 192.168.10.102_1082246440_14766

Sun Apr 18 11:28:43 2004 ERROR: [000.000] {ncbi_socket.c, line 1173} [SOCK::s_Connect] Failed SOCK_gethostbyname(anteaterbeta.blueprint.org)
Sun Apr 18 11:28:43 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sun Apr 18 11:28:43 2004 ERROR: [000.000] {ncbi_socket.c, line 1173} [SOCK::s_Connect] Failed SOCK_gethostbyname(anteaterbeta.blueprint.org)
Sun Apr 18 11:28:43 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sun Apr 18 11:28:44 2004 ERROR: [000.000] {ncbi_socket.c, line 1173} [SOCK::s_Connect] Failed SOCK_gethostbyname(anteaterbeta.blueprint.org)
Sun Apr 18 11:28:44 2004 ERROR: [000.000] {ncbi_connutil.c, line 801} [URL_Connect] Socket connect to anteaterbeta.blueprint.org:80 failed: Unknown
Sun Apr 18 11:28:44 2004 ERROR: [000.000] {ncbi_http_connector.c, line 101} [HTTP] Too many failed attempts, giving up
Sun Apr 18 11:50:14 2004 FATAL ERROR: [013.000] {foldtrajlite2.c, line 1455} Cannot rename filelist.txt.tmp to filelist.txt - access violation


600 generations LOST!!!
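
A failed rename like the one in the last log line can happen on Windows when another process still holds the target file open. One general way to tolerate that is to retry the rename a few times before giving up. This is only a sketch of that technique; `replace_with_retry` and its parameters are hypothetical, and nothing here reflects how foldtrajlite actually handles filelist.txt.

```python
import os
import time

def replace_with_retry(src, dst, attempts=5, delay=0.5):
    """Rename src over dst, retrying if the file is temporarily locked."""
    for attempt in range(attempts):
        try:
            # os.replace overwrites dst if it already exists.
            os.replace(src, dst)
            return True
        except PermissionError:
            # Another process may hold the file; wait and try again.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

If every attempt fails, the function returns False and leaves the temporary file in place, so the data is still on disk for a later retry.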