
Thread: running df on dummy machines

  1. #1

    running df on dummy machines

    I have DF installed on a central server, and a bunch of machines mount this server and use it as their filesystem. I'm wondering if you can edit DF so that the files it writes are uniquely named, so each of these computers can have an instance of DF running out of the same directory in Linux. Each instance could then keep track of the unique instance name it gives to the files it writes, so they'd be able to upload and download independently even though they use the same executable and config files. There's no local fs, so either this can be done or these machines can't be used.

  2. #2
    You could make multiple subdirectories and install multiple copies, one for each node. Each node can then run its own copy from the server machine. You cannot run multiple copies from the same directory. Also note this from the Known Bugs if using NFS:
    Using over NFS, or /tmp partition on NFS disk
    It has been reported by a 3rd party that if running the client on an NFS-mounted partition, or if your /tmp partition happens to be an NFS mounted partition, you need to ensure that the partition is mounted with the 'nolock' option. If you get error messages such as 'Could not open/create data file' and this situation applies to you, please try mounting with the 'nolock' option.
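    For reference, here is roughly what that mount option looks like; the server name, export path, and mount point below are made-up examples, not anything from the DF docs:

        # /etc/fstab entry mounting the shared partition with 'nolock'
        server:/export/df   /mnt/df   nfs   rw,nolock   0 0

        # equivalent one-off mount command
        mount -t nfs -o rw,nolock server:/export/df /mnt/df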
    Howard Feldman

  3. #3
    That's not much of a fix. Blah. Oh well. 30 P4 2GHz machines, and only one of them is usable.

  4. #4
    How do you access the different machines? SSH?

    If this is a cluster (which it sounds like), I can write a little script for you which will set up the correct directories, log in to the different computers, and then start DF in the background. I don't know how much DF writes to the disk, but it might be a good idea to set it up so that it uses more RAM (it has an option for that).
    "There's only one bullet with your name on it, there's a thousand other ones that are addressed 'To Whom It May Concern'."
    -Tracy Paul Warrington

  5. #5
    Each computer mounts the same drive. I'm not making copies of the same stuff on the same drive; that's like 300MB of duplicate data. I can write my own scripts to automatically start and stop the process on all 30 machines. As the program stands, I can only run one instance of DF per DF directory. As long as that's true, that's all that can be done.


    Even if I symlinked the executables, that's still 30 directories with 30 symlinks for each executable and the other files. It's just ugly. I don't think I'll be able to make use of the situation. No biggie.

  6. #6
    The script would have created the dirs and set up the files correctly. Another part of it would have logged in and started the program on the different machines. ANY WAY you designed this very same program, you'd still have many MBs of results files; the only difference now would be that you had many directories. It's not like you'd really care about having that many directories; that's not the problem, as each hostname would have its own dir but share (hardlinked) program files with the others, just not results files.

    But sure, never mind then. I don't really see the point of your whining about this: first saying "oh, it doesn't work like I want it to from the word go", then when somebody can set it up to work ALMOST exactly like you said, "nah, it's not that big of a deal".

    Note for Brian the Fist: Shouldn't the directory "issue" be pretty easily fixed by a simple output option/session marker on the file?
    "There's only one bullet with your name on it, there's a thousand other ones that are addressed 'To Whom It May Concern'."
    -Tracy Paul Warrington

  7. #7
    Who was whining? I was asking a question. And the difference between one directory and 30 is not "almost" the same. I don't care whether they're just links or not. I was asking if it could be done in the same dir. The answer was no, so I said oh well. It's not important enough to go and make a bunch of directories for, so I dropped it. I can still run it on one machine, plus the other computers I have.

  8. #8
    Not flaming anyone or anything of the sort, but in my limited experience, ALL distributed computing clients either require or recommend a unique directory for each CPU. Some clients have SMP built in, but the performance is usually a lot worse. On a cluster, having individual installs is the only way that makes sense, at least to me.

  9. #9
    It can be done. One way would be moving the net code into a separate script/program. You run that once, and the folding client on all of the computers. All the net script/program would do is wait until there are enough finished structures and then upload them. When it uploads, it checks to see if an updated client is around; if so, it writes to a file that all the clients read for this specific reason, and then they all stop and restart with the new client, etc. The thing can be done. It's just that most clients aren't coded with this function in mind. It's a design choice.
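    To make that concrete, a wrapper on each node could poll for the flag file between batches. This is purely a sketch of the scheme above, not anything DF supports: the flag path and the "client" binary name are invented, and it assumes the client exits after finishing a batch of structures (the net program would clear the flag once every node has restarted):

        #!/bin/sh
        # hypothetical per-node wrapper for the flag-file scheme
        FLAG=/mnt/df/update.flag
        while true; do
            ./client                # placeholder for the real DF client
            if [ -e "$FLAG" ]; then
                exec "$0"           # restart; the next run picks up the new client
            fi
        done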

  10. #10
    Of course it can be done. The question is whether it is worth the effort. I don't think it is.
    Pointwood
    Jabber ID: pointwood@jabber.shd.dk
    irc.arstechnica.com, #distributed

  11. #11
    And what is the problem with having (e.g.):
    /home/safemode/distributedfolding/hostname01/
    /home/safemode/distributedfolding/hostname02/
    ...and so on? The script would create all the dirs, take care of the files, start and stop the right clients, etc. Since you're probably smart enough to have named your 30+ machines with a common word followed by a steadily rising number, you wouldn't even need to fill in the hostnames, except for the leading word before the digits. It's not like you're going to have to move around in the different dirs and keep an eye on them. I really don't see the problem.
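    Creating those dirs is a one-liner anyway; a minimal sketch, assuming (hypothetically) that the machines are named hostname01 through hostname30:

        # one working dir per node; the prefix and count are assumptions
        for i in $(seq -w 1 30); do
            mkdir -p /home/safemode/distributedfolding/hostname$i
        done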

    Of course, it's up to you, so please don't feel like I'm trying to force you to do this. I'm just curious.

    Anyone else interested? I'll drop it otherwise.
    "There's only one bullet with your name on it, there's a thousand other ones that are addressed 'To Whom It May Concern'."
    -Tracy Paul Warrington

  12. #12
    Safemode,

    Well the problem is that you're trying to do something that distributed computing was not generally intended to do.

    The whole idea of D.C. is a host of individual machines, each running its own copy of the client. It sounds more like you are trying to run a cluster, which D.C. essentially is (on a massive scale), so that works out to a cluster within a cluster.

    Just because something can be done doesn't necessarily mean it should be done. It's not exactly fair to complain about not being able to run something outside the intended parameters of the project. If it works for you and doesn't interfere with the science, all the more power to you. If it doesn't, well, do something else or don't do it. =)

    All things considered, why not just run the client locally on the machines?

    Best,

    RuneStar½
    The SETI TechDesk
    http://egroups.com/group/SETI_techdesk
    ~Your source for astronomy news and resources~

  13. #13
    I asked a question about whether it could be done. I wasn't complaining that it couldn't be done. You need to read the post instead of reading other people's responses. I'm really getting tired of being referred to as complaining (this is a complaint).

    Secondly, running something off a mounted fs that doesn't happen to be on the computer is not remotely running the program. It is run locally; it just so happens the file is not located locally on the computer. I'm not clustering. Each machine gets a copy of the same executable in RAM; I'm just not giving them a local copy on whatever local fs they have. This would be akin to clustering if any of them interacted with each other, which they don't, and wouldn't in the scenario I mentioned. The scenario I mentioned had passive one-way communication to anything using that directory: clients using that directory would read it whenever they had finished a number of structures and were going to upload, and if it held data telling them to restart, they would all restart independently and continue on. It's not required, but it prevents each of the clients in turn downloading and upgrading the client even though the client has already been upgraded on the fs.

    Anyway, that's all imaginary what-ifs and doesn't matter.

    Potentially an easier solution would be to have each of the computers copy the directory into a ramdisk, run from there, and never re-sync.

  14. #14
    The systems you want to run the clients on are remote, aren't they? =) Also, it wasn't exactly praise either.

    Although the point about not having to download a whole new copy of the client is valid, D.C. generally just isn't set up that way. You might try the Daemon download. I don't really know how it works, but it might address downloading brand new copies from the Web each time.

    However, if the executable hasn't changed, it seems like you could just pass along the libraries instead of the whole program. You would need to ask Howard about that, though.

    Best,

    RS½
    The SETI TechDesk
    http://egroups.com/group/SETI_techdesk
    ~Your source for astronomy news and resources~

  15. #15
    No, it wasn't praise. It was a question.

    If I'm clustering, I'm running things locally and they are executing remotely. Clustering is a specific subset of distributed computing, and I'm definitely not doing or trying to do that. I'm executing everything locally on every computer. I just don't have a permanent local copy. That's the only difference between your normal old computer running DF and what I'd be doing.

    I think you have my physical location confused with the location of the executing program in relation to the computer using it.
    Where I am doesn't matter.

  16. #16
    Heh, first a little note: I wasn't confused about your setup.

    [Note for those who don't know: *nix == *BSD, Linux, UNIX, and so on]

    Second, what OS are you (safemode) using? I took for granted (because Howard mentioned NFS) that it was a *nix.

    Third, RuneStar, when using *nix, "not having to download a client" for each node is not as valid a reason as on a Windows-based network. It's easily bypassed for those with unices. Perhaps one can fix it somehow with Windows too; I don't know and don't really care, as that is not my area.

    Fourth, safemode, if you're using a *nix, let's take this from the beginning, and take it easy:
    It doesn't matter that much whether or not you are logging in to the other comps via the network, whether they are in a cluster, or whether they are just different workstations on a network. It can be fixed rather easily without having everybody use the same dir (and you still would not need to worry about maintaining the dirs). The question is: do you want it to be?

    Just my $.02.

    Btw, there seem to be some people who have rather large computer farms... I doubt they are stupid enough to maintain each node separately. There must have been arrangements made for this before. Otherwise, it can still be done.

    Oh well, take care.

    -Martin
    "There's only one bullet with your name on it, there's a thousand other ones that are addressed 'To Whom It May Concern'."
    -Tracy Paul Warrington

  17. #17
    ramdisks

    Just thought I'd throw my $0.02 in... it seems like it'd be easy to make each computer mount a 50MB ramdisk on /tmp, copy the DF files to a subdirectory of /tmp (from a central master location), and run from there. If you're using Linux, your ramdisk pages can even be swapped out (if you use tmpfs), and it's possible that other *nixes can do that too. So you won't even use much more memory than it takes to store the working set of data. The client also needs write access to /tmp (as Howard pointed out earlier), and it's generally not a good idea to do that across NFS (file locking, etc.). So make /tmp a ramdisk as well.
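    For what it's worth, a sketch of that boot-time setup; the size and paths are illustrative, and "client" stands in for the actual DF binary name:

        # swappable 50MB ramdisk on /tmp (Linux tmpfs)
        mount -t tmpfs -o size=50m tmpfs /tmp
        # copy the DF files from the central master location
        mkdir /tmp/df
        cp -a /mnt/df/. /tmp/df/
        # run the client out of the ramdisk
        cd /tmp/df && ./client &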

  18. #18
    Safemode: I provided you with a solution at the very beginning of the thread. Installing 30 copies should not really be so terrible. Depending on your OS, it is about 5-10MB x 30. Considering you can get a 100GB hard drive for $200, I don't really see what the big deal is with doing this. If you are concerned about updating, auto-update ensures all machines are always up to date, and the proxy daemon ensures you only need to download the update file once instead of 30 times over the internet.
    Howard Feldman

  19. #19
    Howard,

    There is still one area of concern though: bandwidth on the local intranet. However, he could probably figure out a way to stagger things so that all 30 computers aren't updating at the same time. All things considered, the intranet probably has higher bandwidth, so it shouldn't take as long as updating over the internet.


    I just thought of something: once the clients are running, having them write to the server is going to use MORE bandwidth than keeping them on the local machines, because there's going to be all that read and write traffic across the intranet while the structures are being created, whereas on a local machine the only thing that gets sent out is the bz2 files.

    RS½

    P.S. As it is, I agree with the point that 800MB isn't that much considering how cheap storage is these days.
    The SETI TechDesk
    http://egroups.com/group/SETI_techdesk
    ~Your source for astronomy news and resources~

  20. #20
    Ahem, why oh why aren't you guys listening?!

    *nix situation:
    1. Updating the client does not need to hog the whole internet connection.
    2. Does writing the results to the server's disk really take that much bandwidth, considering the client is built to send them over a normal internet connection? If it were a problem, one could just get a switch and put the nodes on a subnet divided from one's other computers, but that shouldn't really be necessary. Correct me, Howard, if I'm wrong.
    3. It does NOT have to take up 5-10MB x 30.
    4. You do NOT have to create the dirs manually.

    How?
    Like I said: the script first receives a setup arg, e.g.:
    df-script setup
    When the arg "setup" is passed to the script, it will automatically create the dirs according to the naming pattern of a set of workstations, e.g.:
    USADCWS001
    USADCWS002
    (the above is just an example, for "USA office, in DC, Workstation, node 001")
    ...and so on. You do not have to write out each hostname/dirname, since there are commands which can handle that, unless you are eager to increase your workload.
    It then symlinks the program files from e.g. /usr/local/distributedfolding/ into
    /usr/local/distributedfolding/USADCWS001
    and after that, it will quit.

    Want to update? Remove (e.g., again, of course) /usr/local/distributedfolding/ and extract your new version. Run the script with df-script setup. Voila, all done.

    Want to start folding? Set up the firewalls on the nodes to accept SSH (easily installed) from the server only, and then run the script with df-script start, which will, in the background, log in to each of the nodes and start the folding from the right directory.

    Want to stop folding? df-script stop:
    the script cds into the right directories and removes the lock file, which will shut down the clients after they have uploaded. Of course, THIS would make the internet go slower (depending on your connection), but you would have to do it anyway. You could easily set it up so that it only uploads one work load at a time, though.
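    Put together, the skeleton of such a script might look like this. To be clear, it's a hypothetical sketch of what I'm describing, not an existing tool: the hostname pattern, paths, and "client" binary name are all made up, and it assumes the DF client exits once its lock file is removed (as described above):

        #!/bin/sh
        # df-script: set up, start, or stop folding on all nodes
        BASE=/usr/local/distributedfolding
        NODES=$(seq -w 1 30 | sed 's/^/USADCWS0/')   # USADCWS001..USADCWS030

        case "$1" in
        setup)
            for n in $NODES; do
                mkdir -p "$BASE/$n"
                ln -sf "$BASE"/client "$BASE"/*.cfg "$BASE/$n/"   # shared program files
            done
            ;;
        start)
            for n in $NODES; do
                ssh "$n" "cd $BASE/$n && nohup ./client >/dev/null 2>&1 &"
            done
            ;;
        stop)
            for n in $NODES; do
                rm -f "$BASE/$n/lock"   # client exits after its next upload
            done
            ;;
        esac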

    Since I have (this is the final time, I won't do it again) offered to write the script for you, if this is still too hard for you, then you are a really poor SOB.

    -Martin
    "Oh shit, I have to make an effort, I better run away!"

    PS. I don't know much about ram disks, but to me it seems unnecessary... but what do I know.
    "There's only one bullet with your name on it, there's a thousand other ones that are addressed 'To Whom It May Concern'."
    -Tracy Paul Warrington

  21. #21
    Oh, btw, I forgot one thing, which Howard pointed out earlier:
    if you are concerned about updating and don't want to run two commands (rm -rf /usr/local/distributedfolding && df-script setup) on top of the one needed to install the client, then
    you could just simply stop the folding (which you would have to do anyway when updating) and install the update over the base dir, since the symlinks will still work. All the node dirs are automagically updated at the same time as the base dir.
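    In other words, something like this (the tarball name is a made-up example, and it assumes the archive unpacks straight into the install dir):

        df-script stop
        # overwrite the shared program files in place; every node's
        # symlinks now point at the new version
        tar xzf distribfold-new.tar.gz -C /usr/local/distributedfolding
        df-script start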

    -Martin
    "There's only one bullet with your name on it, there's a thousand other ones that are addressed 'To Whom It May Concern'."
    -Tracy Paul Warrington

  22. #22
    My objection to a bunch of symlinked dirs was aesthetics. I said it was ugly. That still stands.

    It would take a lot of space if you copied it for each host, but that's not necessary.

    As for the scripting, I'm perfectly capable of scripting up what's necessary in perl. In fact, I already created the script to automatically unpack the tarball of distribfold's client and create the dirs for all the hosts that are up, then log into each host, jump into its specified dir, and begin folding, then jump into each host again and put a timer up so all the machines stop folding after a given time, then remove the base dir and exit from all machines. It's not a matter of it being difficult. It's just not pretty.

  23. #23
    You want pretty or you want functionality? =)

    I think you already figured out that you have a choice between ugly individual clients or nothing at all.

    RS½
    The SETI TechDesk
    http://egroups.com/group/SETI_techdesk
    ~Your source for astronomy news and resources~
