my linux clients don't stop
my off line (-if) clients don't stop, i think it has something to do with the networking part, it thinks it can't find the DF server and gacks itself
i can't wait for phase II NEWS!!!!
Since the "regular" DF client is ferked up anyway with its frequent "quit to idle" mode on starting the client and on uploading, what is there to lose by releasing the Beta and putting a stop to this endless source of distraction to the project?
It is getting more and more common for folders to express feelings of wasted time crunching since the present DF is ending and the Beta DF stats won't count once it is released.
From what I seem to be seeing the Beta client is more stable than the "regular" one anyway. (Did I already mention this?)
Interest levels are waning quickly and it's not like people are just going to come flocking back to DF after they feel they have been forced to crunch other things while we wait... and wait.... and wait for the "Phase II".
Purdy Pleeeeeeeeeeeeze!
-Sid
(or at least fix the idling problem in the regular DF)
Last edited by Insidious; 05-09-2003 at 05:20 PM.
~~~~ Just Passin' Through ~~~~
my linux clients don't stop
my off line (-if) clients don't stop, i think it has something to do with the networking part, it thinks it can't find the DF server and gacks itself
i can't wait for phase II NEWS!!!!
Use the right tool for the right job!
That makes sense with what I am seeing (myself) and reading about (others)
Are you saying you haven't had any difficulty with the DF client going to idle in any of your machines?
-Sid
~~~~ Just Passin' Through ~~~~
Originally posted by Insidious
That makes sense with what I am seeing (myself) and reading about (others)
Are you saying you haven't had any difficulty with the DF client going to idle in any of your machines?
-Sid
just KNOW that when I say "no problem", anywhere-- [dial up, -if] something's gonna go
THWANG!!!
" All that's necessary for the forces of evil to win in the world is for enough good men to do nothing."-
Edmund Burke
" Crunch Away! But, play nice .."
--RagingSteveK's mom
I have to run a large percentage of my systems with -if or they just stop when they should be uploading. I don't want to see the new client before it's ready but the waiting is getting real hard.
Halo Jones Said:
"I have to run a large percentage of my systems with -if or they just stop when they should be uploading."
-----------------------------
Does this happen immediately or after you have generated for a while? There is a flag (-df) that allows the system to accumulate result files forever.
The readme file for production implies that this "feature" will be removed and that its behaviour will become standard in "future releases"... Based on beta 9 responses of the system terminating, I'm wondering if there is any connection...
Ned
It's not that it recognizes there is no connection and stops as it would if you had selected not to increase the max number of buffers (which was 6 I believe)
The case we are talking about is when the DF client wigs out as identified by CPU usage = 0%, but RAM is still reserved and in the case of a service install, the service is not recognized as errored out by XP service manager and restarted if you are so configured in the same XP service manager. (it just stays in this 'idling' state.)
I find that with no intervention from me, I will have no DF running on every second or third boot. To work around this problem, I added the following .bat file scheduled to run when I start my computer. It removes the .lock file and aggressively kills the process. Then, since I have configured Windows to restart the foldtrajlite service on unexpected termination, after one minute DF starts and folds normally.
del c:\distribfold\foldtrajlite.lock
taskkill /f /im foldtrajlite.exe
net start "Distributed Folding Project Service"
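For anyone who would rather script this than keep a .bat file, here is a rough Python sketch of the same three steps. It is only an illustration: the lock path, image name and service name are copied from the .bat file above, while the function names and the injectable `run` parameter are hypothetical additions so the sequence can be tried without touching a real service.

```python
import os
import subprocess

# Values copied from the .bat file above; adjust for your own install.
LOCK_FILE = r"c:\distribfold\foldtrajlite.lock"
IMAGE_NAME = "foldtrajlite.exe"
SERVICE_NAME = "Distributed Folding Project Service"


def cleanup_commands(image_name=IMAGE_NAME, service_name=SERVICE_NAME):
    """Return the two external commands from the .bat file as argument lists."""
    return [
        ["taskkill", "/f", "/im", image_name],   # force-kill the stuck client
        ["net", "start", service_name],          # ask Windows to start the service
    ]


def reset_client(lock_file=LOCK_FILE, run=subprocess.call):
    """Delete a stale lock file, then run the kill and start commands.

    `run` is a hypothetical hook (defaults to subprocess.call) so the
    sequence can be exercised with a stand-in on a non-Windows machine.
    Returns the exit codes of the external commands, in order.
    """
    try:
        os.remove(lock_file)
    except FileNotFoundError:
        pass  # no stale lock left behind; nothing to delete
    return [run(cmd) for cmd in cleanup_commands()]
```

Calling `reset_client()` at startup would mirror the scheduled .bat file. A non-zero exit code from `taskkill` usually just means there was no process to kill.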
This is a pain.... and when coupled with the fact that whether I am folding for the 'regular' or 'beta' client, my stats are going to either end or be erased...
Well, let's just say I'm not criticizing anyone who chooses to crunch something else.
Last edited by Insidious; 05-10-2003 at 06:28 PM.
~~~~ Just Passin' Through ~~~~
What I get is that it finishes 5000 structures, goes to upload and basically dies. The service says it is still running and cannot be stopped without re-booting the computer. The .lock file can be deleted and it says that the service is "pending stop". On reboot the service can be restarted but the 5000 that it was trying to upload are lost.
The same machines set to connect=0 run forever without issue, so that's how I run them. Then, I copy the filelist.txt and its referenced files to another machine, where I upload them.
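The copy step could be scripted along these lines. This is a sketch under a big assumption: it treats filelist.txt as plain text with one file name per line, relative to the client directory, which the thread does not confirm. The function name and its parameters are made up for illustration.

```python
import shutil
from pathlib import Path


def collect_results(client_dir, dest_dir, list_name="filelist.txt"):
    """Copy the file list and every file it names into dest_dir.

    ASSUMPTION: the list holds one plain file name per line, relative to
    client_dir. Listed files that are missing are reported, not fatal.
    Returns (copied, missing) as lists of names.
    """
    client_dir, dest_dir = Path(client_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    list_path = client_dir / list_name

    # Read the names, skipping blank lines.
    names = [ln.strip() for ln in list_path.read_text().splitlines() if ln.strip()]

    copied, missing = [], []
    for name in names:
        src = client_dir / name
        if src.is_file():
            shutil.copy2(src, dest_dir / name)
            copied.append(name)
        else:
            missing.append(name)

    # Copy the list itself last, once the files it names are in place.
    shutil.copy2(list_path, dest_dir / list_name)
    return copied, missing
```

Checking the `missing` list before uploading from the other machine would catch a half-copied batch.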
The machines that fail are W2K but I also have an NT4 server running two instances that worked flawlessly for 6 months but now has the same symptoms so it too now runs nonet.
I have a bunch of NT4WKS machines that never show this symptom.
At home, my XP-Pro and W2K box run as text clients not services since the service installs also had the same problem.
I live with it but look forward to the Beta client resolving this issue as promised by Howard.
This has happened to me a couple of times as well. I have only just remembered in the last few days how to kill tasks started as services. You can't usually kill them with Task Manager as when you start it by right clicking on the task bar it runs under your user context. To start it under the system context you can use the AT command from a command prompt (cmd):
AT 14:13 /INTER TASKMGR
The 14:13 is the time you want it to start. The /INTER tells it you want to interact with the user. Once it starts you should be able to kill tasks started in the SYSTEM context.
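Since the start time has to be typed by hand and has to be a minute or two in the future, a small helper can build the command line. This is just a sketch: the function name and the two-minute default are illustrative choices, and it spells the switch out as /INTERACTIVE rather than the shortened /INTER used above.

```python
from datetime import datetime, timedelta


def at_command(now=None, delay_minutes=2, program="TASKMGR"):
    """Build an AT command line that starts `program` shortly after `now`.

    AT takes a 24-hour HH:MM start time. A delay of a couple of minutes
    avoids picking a minute that is about to pass.
    """
    now = now or datetime.now()
    start = now + timedelta(minutes=delay_minutes)
    return "AT {:%H:%M} /INTERACTIVE {}".format(start, program)
```

For example, called at 14:11 it produces the same 14:13 start time as the command above.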
I hope this helps.
Originally posted by rsbriggs
You can control services via the Start / Control panel / Administrative tools / Services icon (if that helps any)....
Yes, you can. However, I have had instances where the client couldn't be killed even that way. It required a reboot to get it out of memory.
Exactly,
That is what I was trying to describe. This "Idling State" I keep mentioning is a state where the service manager (XP) has no effect. It thinks the service is running, but will not allow it to be stopped or restarted.
You must use the task manager (or the command console) to kill it.
~~~~ Just Passin' Through ~~~~
Task manager or netsvc or services.msc - doesn't matter, only a re-boot will allow it to be re-started.
It was worth a try. I have, in the past, been able to kill services using Task Manager (started using AT) that wouldn't respond to the Service Control Manager or NET STOP commands.
Bring on Phase 2!
the following commands will kill it:
del c:\distribfold\foldtrajlite.lock
taskkill /f /im foldtrajlite.exe
I promise!
-Sid
~~~~ Just Passin' Through ~~~~
I know I'll be back when Phase II starts
Originally posted by Insidious
the following commands will kill it:
del c:\distribfold\foldtrajlite.lock
taskkill /f /im foldtrajlite.exe
I promise!
-Sid
Sid, where does one get this "taskkill" from?
it is a console command in windows XP.
just open the console and type in those two commands.
(or make a .bat file with notepad and run that)
in the console you can type taskkill /? and it will give you the syntax and options.
~~~~ Just Passin' Through ~~~~
Taskkill only works on XP
It appears this problem only occurs for Windows service. I cannot say for certain what is causing it but have my suspicions. Can someone(s) PLEASE try running in standard mode (not as a service) and tell me if any errors/crashes occur on these same boxes? I need to know the error that is occurring when it tries to upload or whatever and this is the only way to get it right now. Thanks.
Howard Feldman
I have not seen this failure when I run in the ASCII mode (non-service)
There are no errors shown in the error.log file when this happens.
If you still have the copies of my client directories I sent you some time back when I first noticed this problem, they should have all the information you need (or lack thereof)
-Sid
~~~~ Just Passin' Through ~~~~
Originally posted by Brian the Fist
It appears this problem only occurs for Windows service. I cannot say for certain what is causing it but have my suspicions. Can someone(s) PLEASE try running in standard mode (not as a service) and tell me if any errors/crashes occur on these same boxes? I need to know the error that is occurring when it tries to upload or whatever and this is the only way to get it right now. Thanks.
Here's the deal. It only happens when you're running as a service:
1. Computer crashes for whatever reason, power failure, hand grenade, light breeze from the NW.
2. Computer is restarted. Service manager tries to restart the client.
At this point, foldtrajlite.lock is still there. All the files are still there, exactly as when (1) occurred. The client is in memory but not running, i.e. using no cycles, making no progress. (Brain-dead.) The client will not die from:
1. Removing the .lock file (Of course not. It's not running!)
2. "End process" thru Task Manager
3. "Stop" thru Service Manager
Only removing the .lock file and rebooting will allow resumption of useful work.
Interestingly, some portion of the time (50%?), the client comes right back up & starts processing a fresh batch.
Sid / Halo / anyone: Additions/corrections welcome!
Last edited by Paratima; 05-12-2003 at 02:46 PM.
I think I've had it die (i.e. freeze forever, not exit, etc) when not in a service installation, specifically when using the upload only option.
Note that if you use any other process to launch DF, DF can easily be killed.
This is something that I've used that works:
it does:
start DF
wait X minutes
delete lock file
wait Y seconds
kill DF process (if not exited)
wait Y seconds
Repeat.
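The loop above could look roughly like this in Python. It is a sketch, not the poster's actual launcher: the function names, the parameters and the injectable `sleep`/`popen` hooks are all hypothetical. It also assumes, as the thread suggests, that deleting the lock file is what tells a healthy client to shut down.

```python
import os
import subprocess
import time


def run_cycle(cmd, lock_file, run_seconds, grace_seconds,
              sleep=time.sleep, popen=subprocess.Popen):
    """One pass of the loop: start, wait, drop the lock, wait, kill, wait.

    Returns True if the client exited on its own after the lock file
    was removed, False if it had to be killed.
    """
    proc = popen(cmd)                  # start DF
    sleep(run_seconds)                 # wait X (given in seconds here)
    try:
        os.remove(lock_file)           # delete lock file
    except FileNotFoundError:
        pass                           # lock already gone
    sleep(grace_seconds)               # wait Y seconds
    exited = proc.poll() is not None
    if not exited:
        proc.kill()                    # kill DF process (if not exited)
        proc.wait()
    sleep(grace_seconds)               # wait Y seconds
    return exited


def watchdog(cmd, lock_file, run_seconds, grace_seconds, cycles=None):
    """Repeat run_cycle forever, or `cycles` times when a count is given."""
    done = 0
    while cycles is None or done < cycles:
        run_cycle(cmd, lock_file, run_seconds, grace_seconds)
        done += 1
```

Because the launcher owns the child process, the kill step works here even in cases where Task Manager or the Service Manager could not stop a service-installed client.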
Team Anandtech DF!
Machines that freeze when running as a service run perfectly when service is set to connect=0. I have seen no errors on machines where the service dies which I have subsequently run as "quiet" normal text clients.
yeah, I agree that the "idling state" only occurs at times when the client is attempting to communicate with the DF server. (whether that be at start up or sometimes at uploads other than the first one.)
-Sid
~~~~ Just Passin' Through ~~~~
Hmm, so this sounds like it is getting stuck somewhere in the socket code (which is not pretty). Hmm. Still, I believe this is NOT a problem with the latest version of the software (the beta) so I think somewhere along the way this problem has been fixed. If not, I will continue to investigate...
Howard Feldman