Multiple instances of the folding-client?



the-mk
08-15-2003, 02:10 AM
Hi!

I've got a question:
I copied the foldit script (foldit-offline) and edited it to run offline. I started the client about 10 hours ago. A few minutes ago I wanted to check its progress, so I typed "top" into my bash shell to confirm that the client is running. It is running, but as more than one instance.

Here is an excerpt from my "top":


PID  USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
3600 root       0  19 28844  28M  2200 R N  99.8  5.6   0:40 foldtrajlite
3599 root      20   0   904  904   712 R     0.1  0.1   0:00 top
...
1177 root      20   0  1056 1056   844 S     0.0  0.2   0:00 foldit-offline
1178 root      20  19 28844  28M  2200 S N   0.0  5.6 471:06 foldtrajlite
2073 root      20  19 28844  28M  2200 S N   0.0  5.6   0:00 foldtrajlite


Any ideas why there are multiple instances of the client?

If I delete the foldtrajlite.lock file, all instances disappear...

I started the client again, and after a few moments there were multiple foldtrajlite instances again, but the client seems to be crunching as usual, with no error.log...

:confused:

Hardware: AMD Thunderbird 1.2 GHz, 512 MB SD-RAM
Software: Suse Linux 8.0, cat /proc/version: Linux version 2.4.18-4GB ([email protected]) (gcc version 2.95.3 20010315 (SuSE)) #1 Wed Mar 27 13:57:05 UTC 2002

Welnic
08-15-2003, 02:20 AM
The instance with the most time is the main process that does the actual work of calculating the folds. The process currently using the processor calculates the trajectory between generations. After it finishes, 1178 will fire back up. This is standard behaviour for the client on Linux.

the-mk
08-15-2003, 03:00 AM
Thanks!

Didn't know that before... it was just a little bit confusing that there is more than one foldtrajlite running. With the new protein it was the first time I noticed it (because it is pretty fast).

BUT: you said that the second instance (here 3600) is the one which calculates the trajectory between generations. What about the third one (2073) with no CPU time? (PIDs from my first post)

I ran "top" again while crunching one generation (no trajectory being generated between generations) and there are two instances of foldtrajlite. Result at 2/50@gen 83 (new PIDs because I restarted the client):



PID  USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
3816 root       0  19 58348  56M  2196 R N  99.7 11.3  38:56 foldtrajlite
. . .
3815 root      20   0  1056 1056   844 S     0.0  0.2   0:00 foldit-offline
3817 root      20  19 58348  56M  2196 S N   0.0 11.3   0:00 foldtrajlite

still confused...

TheOtherPhil
08-15-2003, 05:45 AM
OT: It's not good practice btw to run progs as the root user ;)

I create a user called DF and have the client files in that users directory. I then Ctrl-Alt-F1 and login as DF. I then run the client from there and Alt-F7 back to the gui.

the-mk
08-15-2003, 06:04 AM
Originally posted by TheOtherPhil
OT: It's not good practice btw to run progs as the root user ;)

Yes, I know, but I'm just too lazy to create another user, change the rights of the distribfold-dir and start it again...

Welnic
08-15-2003, 02:01 PM
I don't know what the third instance does. I know that the multi process behavior started in phase II, it was not in phase I. There might be some other mode besides the normal folding and the trajectory that 3817 does. Looking at what I have running now I see some cases that have that process and some that don't. I have never seen that process with even 1 second of accumulated cpu time.

One thing that worried me at first, and that I understand now, is that the linked processes share the same memory. In your last listing, 3816 and 3817 together use 58348; they do not each use that amount.

I just tried the S switch while running top, which shows the accumulated CPU time including that used by now-defunct child processes. When I do that, the process now shows up with some time. After watching a trajectory being calculated, it appears that the process that hangs around and seems to do nothing is actually the one that calls the trajectory process.
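
To put numbers on the shared-memory point, here is a rough Python sketch using the rows from the second "top" listing in this thread. It assumes top reports SIZE in kilobytes, and it treats rows with the same command and identical SIZE as threads of one process; that grouping rule is only a heuristic for illustration, not something top itself does. The function names are made up for this example.

```python
# Rows copied from the second `top` listing in this thread.
# Fields kept: PID, SIZE (assumed to be KB), COMMAND.
rows = [
    (3816, 58348, "foldtrajlite"),
    (3815, 1056, "foldit-offline"),
    (3817, 58348, "foldtrajlite"),
]


def naive_total_kb(rows):
    """Add up SIZE for every PID, which double-counts shared memory."""
    return sum(size for _pid, size, _cmd in rows)


def shared_aware_total_kb(rows):
    """Count each (COMMAND, SIZE) pair only once.

    Assumption: rows with the same command and identical SIZE are
    threads sharing one address space. Heuristic, for illustration only.
    """
    unique_pairs = {(cmd, size) for _pid, size, cmd in rows}
    return sum(size for _cmd, size in unique_pairs)


print(naive_total_kb(rows))         # 117752
print(shared_aware_total_kb(rows))  # 59404
```

The naive sum suggests the client needs about twice the memory it really uses, which is the misreading described above.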

HaloJones
08-15-2003, 02:48 PM
FWIW, I have the Windows CLI client running a single instance. I came home today and noticed it had far fewer points stashed than its sister computer. Task Manager revealed two foldtrajlite.exe processes running.

I used my DFGui to stop its foldtrajlite and both stopped. The number of points stashed went from 180000 to 65000 and the filelist.txt does not include half the stashed files in the directory.

I very much doubt that it will upload properly.

:bang:

Brian the Fist
08-15-2003, 04:50 PM
For your info, the three threads are:

Master control thread - calls main thread and goes to sleep
Main thread - does the 'work'
Spawned thread - minimizes energy and computes trajectory distributions

The Main thread takes care of updating the progress bar while spawned threads operate.
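
For anyone who wants to picture that layout, here is a minimal Python sketch of the same three roles. It is not the client's code (the client is said to use POSIX threads); the function names and the event log are invented for illustration, and the "work" is just a log entry.

```python
import threading

events = []  # records what happened, in order, for illustration
events_lock = threading.Lock()


def log(msg):
    with events_lock:
        events.append(msg)


def spawned_thread(generation):
    # Stand-in for energy minimization and trajectory distributions.
    log(f"spawned: trajectory for generation {generation}")


def main_thread(generations):
    for gen in range(1, generations + 1):
        log(f"main: folding generation {gen}")
        helper = threading.Thread(target=spawned_thread, args=(gen,))
        helper.start()
        # The real client reportedly updates its progress bar at this
        # point; this sketch simply waits for the helper to finish.
        helper.join()


def master_control(generations=2):
    worker = threading.Thread(target=main_thread, args=(generations,))
    worker.start()
    worker.join()  # the master "goes to sleep" until the main thread ends
    log("master: done")


master_control()
print(events)
```

In this sketch the master and the helper accumulate almost no CPU time of their own, which fits the idle entries people are seeing in top.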

Note that these are three 'threads' of the same process. I'm not sure why they show up as 3 different PIDs under Linux, but it must be a quirk in the way the threads are created. It is using the standard POSIX threads system.

bwkaz
08-15-2003, 05:48 PM
Originally posted by Brian the Fist
Note that these are three 'threads' of the same process. I'm not sure why they show up as 3 different PIDs under Linux, but it must be a quirk in the way the threads are created. It is using the standard POSIX threads system.

It's the version of procps (specifically, top's and/or ps's versions) that they're running.

I'm using 3.1.11 and it doesn't happen here. When I switch over to my firewall which is still running 2.0.7, it shows up that way.

When I upgraded my home machine to procps 2.0.10 from the version 2.0.7 that installed with LFS, this behavior also went away. I later upgraded again, to 3.1.11 (which was written by somebody else) because I read somewhere that I needed it for kernel 2.6.0-testX, which I could unfortunately never get to work with my USB mouse... no big deal though.

Edit: They actually do get different PIDs, though, due to the way Linux handles threads in general. This is why earlier versions of procps pick them up as separate processes. Later versions (2.0.10 and 3.1.11) filter out all but one of the PIDs that share most of their memory with another PID (basically, all threads).
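
If you want to see the per-thread IDs for yourself, here is a small Python sketch. It assumes a Linux system that exposes /proc/<pid>/task, as kernels with the newer NPTL threading do; there the threads share one PID and each gets its own thread ID, so this may not match the 2.4 kernel discussed in this thread. The helper names are made up for the example.

```python
import os
import threading


def thread_ids(pid="self"):
    """Return the kernel thread IDs listed under /proc/<pid>/task.

    Returns an empty list where that directory does not exist
    (for example on non-Linux systems).
    """
    task_dir = f"/proc/{pid}/task"
    if not os.path.isdir(task_dir):
        return []
    return sorted(int(name) for name in os.listdir(task_dir))


def ids_with_workers(n):
    """Start n idle threads, list this process's thread IDs, clean up."""
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(n)]
    for w in workers:
        w.start()
    ids = thread_ids()
    stop.set()
    for w in workers:
        w.join()
    return ids


ids = ids_with_workers(3)
print(os.getpid(), ids)
```

On such a system the list should hold one ID per running thread, which is the kind of per-thread entry that older procps versions displayed as separate rows.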