DJP's Sneaker-Net HOWTO

**djp** · 07-02-2003, 07:36 PM

I read the Phase II FAQ's Sneaker-Net section and found it difficult to understand. The FAQ tells me clearly which files I should keep, but I really want to know which files to delete. Fortunately, my old Phase I Sneaker-Net process still works (albeit with a few modifications).

'Sneaker-Net' refers to running the client on one or more machines that never connect to the internet, then copying the work they have done (via a private network, floppy disks, a Zip disk, or a USB flash device, for example) to a different machine which is on the internet, and uploading from there.

I have a private network connecting most of my folding clients, so I have written a Windows 2000 batch file to run on one computer that is connected both to the Internet and to my private network. I probably should have written it in Perl for the sake of portability, but I just haven't managed to learn Perl yet. This script gathers work from each of the folding machines on the isolated network and uploads it all to Toronto. I haven't documented the script well yet, and I need to replace some of the "hard-coded" information that is specific to my network with variables so that the script can be easily adapted to other folks' networks. This should be ready in a day or three. I've also got a good idea of how I can write scripts to Sneaker-Net via a USB storage device. That may take me a week to get done (I've got a few other priorities).

Without further ado...

DJP's Sneaker-Net HOWTO:

There are two basic approaches to Sneaker-Netting Distributed Folding work from offline computers to internet-connected computers for uploading: Simple and Complex. The Simple method's advantage is that it doesn't require knowledge about the work files' function or contents. Its disadvantages are some extraneous file copying and the inability to resume folding with a Client computer until the upload process is complete. The Complex method will allow a client machine to start folding without needing any feedback from the uploading machine. It requires an examination of the FILELIST.TXT file to determine which files need to be kept locally, which are copied to the uploader, and which are just deleted from the client machine. Unfortunately, I don't understand the Complex method fully yet, so that HOWTO will have to wait at least a few more days for me to study the upload process.

THE SIMPLE METHOD has three steps:

1) Stop the client and offload work files

2) Copy files to an Internet-connected machine and Upload to DistributedFolding.Org

3) Copy leftover files back to the client and resume folding

OFFLOADING
STOP the Distributed Folding Client. (Hit "Q" or see the project documentation for details on how to do this.) Sort the files in your client's DISTRIBFOLD directory by "Date Modified" and you'll see that all of the newest files are the actual work while the older files are the engine of folding. (Before folding my first unit of work, I set all of my client machines to use the -rt switch, so my FOLDIT.BAT file is always newer than the rest of the DF files but older than the work that my Distributed Folding Client has generated. If I'm copying the files manually, I select everything newer than FOLDIT.BAT and MOVE it over to my transfer media.) If I've got a whole bunch of folding machines to harvest, I'll make separate directories for each computer so that I can avoid confusing one client's work with another's. This can also be scripted:

Code:

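REM The variable %handle% is my 8-character folding ID assigned by the project.
REM It must be set before this script runs. If it were empty, the last MOVE
REM below would expand to *.* and move the folding client itself off the machine.
IF NOT DEFINED handle (
    ECHO The handle variable is not set. SET it first, then run this again.
    EXIT /B 1
)
REM Let's assume the transportation medium will always be drive F:
REM and all clients will use the directory c:\distribfold
REM You may need to edit the script to fit your environment.
F:
CD \
REM Create this machine's folder only if an earlier run has not already made it.
IF NOT EXIST f:\%COMPUTERNAME% MD f:\%COMPUTERNAME%
MOVE c:\distribfold\filelist.txt f:\%COMPUTERNAME%\
MOVE c:\distribfold\error.log f:\%COMPUTERNAME%\
MOVE c:\distribfold\fold_*.* f:\%COMPUTERNAME%\
MOVE c:\distribfold\%handle%*.* f:\%COMPUTERNAME%\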
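
The manual selection rule from the paragraph above (everything newer than FOLDIT.BAT is work, everything else is the engine) can also be sketched in a more portable language. This is only an illustration in Python under my own assumptions: the function names `files_newer_than` and `offload` are made up for this example, the marker file is assumed to be named exactly `FOLDIT.BAT` (the case matters on case-sensitive file systems), and the rule only holds if FOLDIT.BAT really is newer than every engine file, as it is on my machines.

```python
import shutil
from pathlib import Path


def files_newer_than(directory, marker_name="FOLDIT.BAT"):
    """Return the files in `directory` modified after the marker file."""
    directory = Path(directory)
    marker_mtime = (directory / marker_name).stat().st_mtime
    return sorted(
        path for path in directory.iterdir()
        if path.is_file() and path.stat().st_mtime > marker_mtime
    )


def offload(directory, media_dir, marker_name="FOLDIT.BAT"):
    """Move every file newer than the marker into media_dir.

    Returns the names of the files that were moved.
    """
    media_dir = Path(media_dir)
    media_dir.mkdir(parents=True, exist_ok=True)  # one folder per machine
    moved = []
    for path in files_newer_than(directory, marker_name):
        shutil.move(str(path), str(media_dir / path.name))
        moved.append(path.name)
    return moved
```

On a client this might be called as `offload(r"c:\distribfold", "f:\\" + os.environ["COMPUTERNAME"])` after `import os`, with the client stopped first.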

If you were to restart the client folding now, it would be as if you had a new installation. None of the work you've done would be remembered by this client, so it could not refine the structures you've already done in order to make better ones. The work your client has done between the completion of the last whole generation and the moment you stopped the Folding Client would be wasted. Resist the temptation to resume folding now.

I used the MOVE command rather than the COPY command because I sometimes want to check manually that I've taken everything: after a MOVE, any work file still sitting in the client's directory is one that my wildcards missed. I can do a MOVE because I trust my transportation media completely. If you are using floppy diskettes, you might want to do a COPY and then DELETE the files later when you're confident that they copied accurately.
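For less trustworthy media, the COPY-then-DELETE idea can be sketched as follows. This is a hedged Python illustration, not part of my batch script: the names `sha256_of` and `copy_verify_delete` are invented for the example, and comparing SHA-256 hashes is just one way of deciding that a copy is accurate. Note that the operating system may answer the read-back from its cache rather than from the disk itself, so for floppies it is safer to eject and reinsert the disk and check again before trusting it.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large work files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verify_delete(source_files, media_dir):
    """Copy each file to media_dir and delete the original only if the
    copy hashes the same. Returns the names of files that failed the check
    (their originals are left in place)."""
    media_dir = Path(media_dir)
    media_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for source in map(Path, source_files):
        target = media_dir / source.name
        shutil.copy2(source, target)
        if sha256_of(source) == sha256_of(target):
            source.unlink()
        else:
            failed.append(source.name)
    return failed
```

An empty return value means every original was copied, verified, and removed.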

UPLOADING
MOVE the contents of the first folder (representing the first Folding Client's work files) from your transport media into the DISTRIBFOLD directory of your internet-connected computer. Don't delete the empty directory on the transportation media. Upload the work by running:

Code:

.\foldtrajlite -f protein -n native -ut

This will upload all the completed files to the server, rebuild the filelist.txt file, and leave behind the files that your Client will still need for more work.

Next, MOVE the work-in-progress back to the appropriate folder of your transport media for reloading. You can use the same sort of manual or scripted procedures as above, but be careful that you get the work from each computer back into its own folder. A different script will be needed and I'm not writing this one for you today. (Bug me in a week or two if you care.)
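Until I write that script, here is a rough Python sketch of what the loop over the per-machine folders could look like. Treat it as an outline under stated assumptions rather than a tested tool: the function names are invented for the example; the uploader's DISTRIBFOLD directory is assumed to hold only the engine files (that machine is not folding on its own), so anything that was not there before the work was moved in is treated as that client's leftover work-in-progress; and the upload step is passed in as a function so the loop can be tried without a network connection. The default `run_uploader` simply runs the command shown above.

```python
import shutil
import subprocess
from pathlib import Path


def run_uploader(distribfold):
    """Run the upload command from the post inside the distribfold directory."""
    exe = str(Path(distribfold) / "foldtrajlite")
    subprocess.run([exe, "-f", "protein", "-n", "native", "-ut"],
                   cwd=str(distribfold), check=True)


def upload_all(media_root, distribfold, upload=run_uploader):
    """For each machine folder on the media: move its work in, upload,
    then move whatever work is left back into the same folder."""
    media_root, distribfold = Path(media_root), Path(distribfold)
    # Assumption: everything present before we start is the engine.
    baseline = {path.name for path in distribfold.iterdir()}
    for machine_dir in sorted(p for p in media_root.iterdir() if p.is_dir()):
        for item in list(machine_dir.iterdir()):
            if item.name in baseline:
                raise FileExistsError(f"{item.name} would overwrite an engine file")
            shutil.move(str(item), str(distribfold / item.name))
        upload(distribfold)
        # Leftovers go back to the folder they came from, never another one.
        for item in list(distribfold.iterdir()):
            if item.name not in baseline:
                shutil.move(str(item), str(machine_dir / item.name))
```

Processing one folder at a time is what keeps each computer's work from being mixed with another's.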

RELOADING
Take your transportation media back to the Distributed Folding Client machine(s) and move the remaining work files back to their individual distribfold directories. You may now delete empty directories and resume your folding.
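The reloading step could be sketched like this in Python. As before, this is an illustration with assumptions of my own: `reload_client` is a made-up name, and the media is assumed to hold one folder per machine, named after the computer as in the offloading script.

```python
import shutil
from pathlib import Path


def reload_client(media_root, computer_name, distribfold):
    """Move this machine's remaining work files from the media back into
    its distribfold directory, then remove the emptied media folder.

    Returns the names of the files that were restored.
    """
    machine_dir = Path(media_root) / computer_name
    distribfold = Path(distribfold)
    restored = []
    for item in sorted(machine_dir.iterdir()):
        shutil.move(str(item), str(distribfold / item.name))
        restored.append(item.name)
    machine_dir.rmdir()  # rmdir only succeeds on an empty directory
    return restored
```

On a client this might be called as `reload_client("f:\\", os.environ["COMPUTERNAME"], r"c:\distribfold")` after `import os`, and the folding client restarted afterwards.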

You probably didn't notice that you've copied a few more files than necessary:

Some of the files you copied were previously uploaded, so they aren't going to be uploaded a second time. (The Folding Client needs them as a basis for improving the current and future generations, however.) Perhaps we could have refrained from copying them if we knew more about the mechanics of the upload process.

Some of the files you copied to the Internet-connected uploader got returned to the Folding Client without modification because they will be needed as a basis for improving the current and future generations. Perhaps we could have copied these files over to the uploader without deleting them from the Client and then just not bothered to copy them back.

I prefer not to think about this. The files involved are only a few hundred kilobytes and I've got some room to spare on my transport media. These decisions of which files to copy in what direction are complex. That leads me to the Complex method of Sneaker-Netting, to be discussed later.

<DIGRESSION>
If you MOVEd all the work files off of your offline Distributed Folding Client boxes, then each offline client has the same information as any other. Theoretically, you can return any machine's work to any other machine without a care. I prefer to be a little bit paranoid and copy each machine's work-in-progress back to the machine that generated it. I do this just in case there is some rarely used file out there that doesn't fit my selection wildcards above so that it won't get separated from other files in its set.
</DIGRESSION>
