View Full Version : NOT Having a Great Time



Paratima
07-16-2002, 01:54 PM
NOT Having a Great Time
...with this update. :mad:

How about you?

Brian the Fist
07-16-2002, 02:05 PM
Would you care to elaborate?

wirthi
07-16-2002, 02:07 PM
I have problems here too. Download is just about 2 KB/sec, compared to full usage of my DSL line during the last few updates.

I guess this is a problem with the new mirrors ....

Greets,
Wirthi

Paratima
07-16-2002, 02:12 PM
Did a full file download of the icc version for Linux. Got the 908(?) Incorrect Protein error and it stopped... but didn't die. So I killed it manually (-9), removed the offending bz2 files. Now it just says "stopped", and it is, but it's not running either, and has to be killed manually.

Downloaded the gcc version. Same results.

Meanwhile, not a single Windoze machine has successfully upgraded.

:help: :help: :help: :help: :help: :help: :help:

Paratima
07-16-2002, 02:16 PM
Windoze machines (NT & W2K mix) are getting:

========================[ Jul 16, 2002 1:43 PM ]========================
ERROR: [010.003] {taskapi.c, line 1199} [ReadServerResponse] Timeout waiting for response, got 1103760 chars.

and then shutting down. :help:

PY 222
07-16-2002, 02:24 PM
I am having the same problem as the rest of the folks here. None of my clients are updated as of right now and I didn't even do anything. Did not press 'Q', nada. I just allowed the client to do its business.


Wham... died :swear: I think it must be the bandwidth problem again! If this continues with this large user base of over 10k, I think you are going to have a REALLY big problem on your hands pretty soon. :help:

vsemaska
07-16-2002, 03:06 PM
As for the '908 INCORRECT PROTEIN' error, I posted an entry in the Bug Tracking section. I got this error only when I used the -qt option. Took that option out and the problem went away.

This happens in NT & Tru64 UNIX. I suspect it's in all versions.

Vic

Brian the Fist
07-16-2002, 03:44 PM
As I already mentioned earlier, the bandwidth problem has not been corrected for this update. I made sure it is a smaller protein though to ease the download. The mirrors etc. will be effective (if all goes well) with the NEXT update (we had to get this update to you first so it knows to look for the mirrors, that is). We will also be increasing our own bandwidth, hopefully in time for next week.

Sorry for the inconvenience but just try again in an hour or let it try by itself (i.e. don't touch that Q button) and it should get the update fine.

Paratima
07-16-2002, 03:48 PM
I have 12 Winders boxen that have STOPPED all by themselves. No 'Q', no nuttin! :swear:

Also the aforementioned Linux boxen... THAT problem ain't bandwidth!!! Should I download the files again or what?

TheOtherZaphod
07-16-2002, 03:51 PM
It looks like I am getting results stacked up on the client machines, but not uploaded. Hopefully that means that all will work itself out...

I downloaded a fresh copy of the client after the update and distributed it by hand to my home farm (12 win... machines). Doing the update that way seemed a bit better than suffering thru the auto-update process for each one.

Brian the Fist
07-16-2002, 03:51 PM
Well, why did they stop then? Is there an error in the error.log indicating why, perhaps? Post them here for me to see if there are any. What happens if you start them up again by typing 'foldit'? Does it say 'an update is available' etc.? If so, just say 'y' and it should get it OK by now; the traffic isn't too bad right now.

TheOtherZaphod
07-16-2002, 03:53 PM
Here is what is in one of my error logs:

========================[ Jul 16, 2002 1:38 PM ]========================
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -828.070164 eefworst -952.800000 rmsdval 78.017827
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -845.907339 eefworst -952.800000 rmsdval 74.435193
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -818.183293 eefworst -952.800000 rmsdval 62.101111
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -877.551630 eefworst -952.800000 rmsdval 82.310259
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -850.647470 eefworst -952.800000 rmsdval 77.086631
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -897.869639 eefworst -952.800000 rmsdval 78.883848
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -875.606230 eefworst -952.800000 rmsdval 89.246576
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -870.446439 eefworst -952.800000 rmsdval 73.627535
ERROR: [001.000] {foldtrajlite.c, line 4351} potcharmm -816.532227 eefworst -952.800000 rmsdval 72.585374

========================[ Jul 16, 2002 3:49 PM ]========================
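
For anyone collecting these logs from a whole farm, here is a minimal sketch of pulling the fields out of one error.log line. It assumes the line layout is exactly what is pasted in this thread (`ERROR: [code] {file, line N} message`); the function name `parse_error_line` and the regular expression are illustrative and are not part of the DF client.

```python
import re

# Layout assumed from the lines pasted in this thread:
#   ERROR: [001.000] {foldtrajlite.c, line 4351} <message>
LINE_RE = re.compile(
    r"ERROR: \[(?P<code>[\d.]+)\] \{(?P<file>[^,]+), line (?P<line>\d+)\} (?P<msg>.*)"
)

def parse_error_line(text):
    """Return a dict of fields, or None if the line is not an ERROR entry."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["line"] = int(entry["line"])
    # Messages such as "potcharmm -828.07 eefworst -952.80 ..." are
    # "name value" pairs; collect any decimal values found that way.
    pairs = re.findall(r"(\w+) (-?\d+\.\d+)", entry["msg"])
    entry["values"] = {name: float(value) for name, value in pairs}
    return entry

sample = ("ERROR: [001.000] {foldtrajlite.c, line 4351} "
          "potcharmm -828.070164 eefworst -952.800000 rmsdval 78.017827")
parsed = parse_error_line(sample)
print(parsed["file"], parsed["line"], parsed["values"]["rmsdval"])
```

Date separator lines and blank lines return `None`, so they can simply be skipped when looping over a log file.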

Digital Parasite
07-16-2002, 04:06 PM
Howard: Do you also get the '908 INCORRECT PROTEIN' error if the server is overloaded?

I am running the new DF client and protein and I get this error on a Win2k box when trying to upload the results that I have so far (this is at 4pm EDT).

Jeff.

Paratima
07-16-2002, 04:32 PM
Manually restarting stopped winders clients. Still having a problem with Linux client(s).

I'm running SuSE 7.3, which until now has run either Linux client just fine. I downloaded and tried to run both the Intel and the gcc versions. (Serially, not simultaneously.) Both just sit there. Program in memory doing nothing. Blank entry in error.log.

Progress.txt never gets created. Deleting foldtrajlite.lock does NOT stop the client.

Should I download again? Help help!

ulv
07-16-2002, 04:41 PM
8 text clients on W2K, all autoupdated correctly:D

Paratima
07-16-2002, 04:45 PM
Terje, were they running as service or freestanding?

ulv
07-16-2002, 04:52 PM
Paratima: Freestanding- all of them.

stappel
07-16-2002, 04:58 PM
So far I've been unable to upload any results. Three machines get '908 INCORRECT PROTEIN' errors when they try to upload. They all already have the new client (structure size = 85).

PY 222
07-16-2002, 05:16 PM
Just an update. After several hours, I managed to get all my boxes up and running. All Windows boxes (only 3 :D) updated with a restart of the client, after all the initial "Unable to connect" error messages!

Me now happy :cheers:

But seriously, if this continues on to August, it will be ugly :bang: Brian, I hope that everything will run smoothly on the next update.

Great job. Keep it up :thumbs:

Brian the Fist
07-16-2002, 05:27 PM
If you get the 908 error AFTER updating (which should happen relatively rarely) then you should delete the filelist.txt file and all will be fixed.

Crossroads
07-16-2002, 06:11 PM
Originally posted by Brian the Fist
If you get the 908 error AFTER updating (which should happen relatively rarely) then you should delete the filelist.txt file and all will be fixed.

I downloaded the latest program. I deleted the filelist.txt, but i still get this error. 5 Hours wasted till now :mad: Worked fine last week. I use the commandclient on win98se.

Crossroads
07-16-2002, 06:42 PM
Originally posted by Crossroads


I downloaded the latest program. I deleted the filelist.txt, but i still get this error. 5 Hours wasted till now :mad: Worked fine last week. I use the commandclient on win98se.

I did a clean install. That seems to help.

Crossroads
07-16-2002, 06:52 PM
Originally posted by Crossroads


I did a clean install. That seems to help.

No, that did not help either. I quit and return to eccp until this is solved.

Crossroads
07-16-2002, 07:55 PM
Originally posted by Crossroads


No, that did not help either. I quit and return to eccp until this is solved.

After further testing, I found where the problem is. If I run it open (not quiet) I can upload. If I run it quiet I cannot anymore. Not even when I run it open again after that.

KWSN_Millennium2001Guy
07-16-2002, 07:55 PM
Originally posted by Brian the Fist
If you get the 908 error AFTER updating (which should happen relatively rarely) then you should delete the filelist.txt file and all will be fixed.

I shutdown the client on all the machines in my farm. I deleted the filelist.txt. I also deleted all of the orphaned fold*.bz2 and yh*.bz2 files. I restarted the client on all of the machines. Then for kicks I spot-checked 10 of the machines. On each of those machines I deleted the foldtrajlite.lock file. I waited for the progress.txt file to go away, indicating the client has stopped. Then I manually ran the foldit.bat file. The program reports "STATUS 908 INCORRECT PROTEIN" and after a pause continues to crunch units. Starting and stopping the service results in many more bz2 files being created, but NONE are uploading. My production went to zero since the changeover.

PLEASE HELP!

KWSN_Millennium2001Guy
07-16-2002, 08:10 PM
All of my machines are W2K running the client as a service. All fail to upload when running as a service. If I manually delete the filelist.txt file and run the client in a CMD terminal, then the client appears to send in the result after I type "Q". Once I start the client as a service, no more files will upload from that machine until after I delete the filelist.txt file.

Hope this helps.

Digital Parasite
07-16-2002, 08:11 PM
Originally posted by Brian the Fist
If you get the 908 error AFTER updating (which should happen relatively rarely) then you should delete the filelist.txt file and all will be fixed.

Howard: All my clients so far Windows and Linux are getting the 908 error. Some of them auto-updated, some of them were fresh installs but all of them are building up structures and not uploading. If I try a manual upload -u t I get the 908 error on all my clients.

It seems with my Linux client if I delete the filelist.txt file and start it using the normal foldit command line ./foldtrajlite -f protein -n native it will work fine and upload the results. If I use -qt on the command line it gives me the 908 error. If I restart the DF client with just the original parameters above it will still give me 908 errors. I can edit the filelist.txt file and remove the first two .bz2 entries which were from the client running with -qt. If I do that it will upload fine so with the Linux client it seems that -qt is screwing something up.

With the Win32 service version on both Win2k and XP it gives you the 908 error. I tried deleting the filelist.txt and all .bz2 files and restarting the service and it just queues up files. If I try to manually upload those I get the 908 error. Same thing with the Win32 text client if I use the original foldit.bat settings it will run and upload results. If I use -qt or run as a service it creates files that give me a 908 error.

Jeff.
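
The workaround described above (removing the entries written while running with -qt from filelist.txt) can be sketched as a small script. This is only a sketch under stated assumptions: it assumes filelist.txt holds one file name per line, which the thread does not confirm, and the helper names `drop_entries` and `clean_filelist` are made up for illustration. It keeps a backup copy and never deletes the .bz2 files themselves, so nothing is lost if the guess about the format is wrong.

```python
from pathlib import Path
import shutil

def drop_entries(lines, bad_names):
    """Return the lines whose stripped text is not in bad_names."""
    bad = set(bad_names)
    return [ln for ln in lines if ln.strip() not in bad]

def clean_filelist(path, bad_names):
    """Back up the list file, then rewrite it without the bad entries.

    ASSUMPTION: one file name per line. Returns how many lines were removed.
    """
    path = Path(path)
    # Keep a copy next to the original, e.g. filelist.txt.bak
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    lines = path.read_text().splitlines(keepends=True)
    kept = drop_entries(lines, bad_names)
    path.write_text("".join(kept))
    return len(lines) - len(kept)

# Example call (the .bz2 names here are hypothetical placeholders):
# clean_filelist("filelist.txt", ["first_qt_file.bz2", "second_qt_file.bz2"])
```

Stop the client before editing the list, and restore the `.bak` copy if the upload still fails.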

Brian the Fist
07-16-2002, 08:58 PM
I fixed this now. The problem was only for Windows Service and Quiet mode which is why I didn't find the bug earlier (i.e. before release) (unfortunately I don't have the resources to test all possibilities with every release :( ). No data is lost if you didn't erase it yourself, your data should now all upload correctly.
:|party|:

PY 222
07-16-2002, 09:05 PM
Originally posted by Brian the Fist
I fixed this now. The problem was only for Windows Service and Quiet mode which is why I didn't find the bug earlier (i.e. before release) (unfortunately I don't have the resources to test all possibilities with every release :( ). No data is lost if you didn't erase it yourself, your data should now all upload correctly.
:|party|:

Does this mean that if we run our client in Quiet mode or as a Service, we must download another client, or does the autoupdate take care of it?

Brian the Fist
07-16-2002, 09:32 PM
There is nothing new to download, continue using the software you have, it works fine now in any mode.

KWSN_Millennium2001Guy
07-16-2002, 10:13 PM
Thanks Howard. It all seems to be sorted now. I am uploading files now. I just wish that I hadn't deleted the filelist.txt and all of the .bz2 files. I flushed 7 hours of crunching on over 100 machines trying to troubleshoot the problem.

Ni! :cheers: :|party|:

Jodie
07-16-2002, 11:00 PM
I updated half of my machines by hand earlier, saved the second half until things calm down since that half is on the ds3. If things are calm - I'm going to do them now...;)

Scoofy12
07-17-2002, 08:47 AM
I've been having problems (http://bane.free-dc.org/forum/showthread.php3?s=&threadid=1213) of my own (not related to DF) but just got around to reading this thread. Bad that there was a bug, BUT... I count 7 hours, 4 minutes from the first report of a problem until its resolution. That's not bad, eh?
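
The 7 hours, 4 minutes figure can be checked from the post timestamps in this thread: the first report at 07-16-2002, 01:54 PM and the "I fixed this now" post at 07-16-2002, 08:58 PM. A quick sketch, assuming both timestamps are in the same time zone (variable names are just for illustration):

```python
from datetime import datetime

FMT = "%m-%d-%Y, %I:%M %p"  # timestamp layout used by the forum posts

first_report = datetime.strptime("07-16-2002, 01:54 PM", FMT)
fix_posted = datetime.strptime("07-16-2002, 08:58 PM", FMT)

elapsed = fix_posted - first_report
hours, remainder = divmod(int(elapsed.total_seconds()), 3600)
minutes = remainder // 60
print(f"{hours} hours, {minutes} minutes")  # 7 hours, 4 minutes
```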


Btw, Linux folks, please check my link, I still need :help:
Thanks all:)