
Thread: Increasing the buffer size.

  1. #1
    Richard Clyne (Senior Member - Fife, Scotland - Joined Dec 2001 - 621 posts)

    Increasing the buffer size.

    In the readme file it states: "By default, if the client has generated six sets of data and not been able to upload them to our server..."

    Adding the "-df" switch increases the amount of hard drive space used, but also increases the amount of work that can be buffered before uploading.

    I have just recently set up a couple of machines on a network using a dial-up (DUN) Internet connection.

    Knowing how many additional file sets the "-df" switch allows would let me better estimate how long my machines can run before needing to upload.

  2. #2
    They can accumulate several weeks' worth of work with -df - effectively unlimited, though there is a finite cap (1000 file pairs for now, but this may change).
    Howard Feldman
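
    To put that cap in perspective, here is a minimal back-of-the-envelope sketch in Python. The 1000-pair cap is Howard's figure above; the 4-hours-per-fileset rate is an assumption borrowed from the P3 550 estimate later in this thread - substitute your own machine's rate.

    code:
    # Estimate how long the -df buffer lasts before hitting the file-pair cap.
    FILE_PAIR_CAP = 1000       # current client limit, per Howard's post above
    hours_per_fileset = 4.0    # assumption: roughly a P3 550; adjust for your CPU

    filesets_per_day = 24.0 / hours_per_fileset
    days_to_cap = FILE_PAIR_CAP / filesets_per_day
    print(f"{filesets_per_day:.0f} filesets/day -> cap reached in about {days_to_cap:.0f} days")
    # 6 filesets/day -> cap reached in about 167 days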

  3. #3
    Richard Clyne
    Cheers Howard,

    That's more than enough to meet my needs, unless someone brings out a 10 GHz CPU in the next couple of weeks.

  4. #4
    I haven't checked the current protein's result data size, but for one a while back I was no-netting and it was about 220k per .bz2 file; I can't remember if that was 5,000 or 10,000 results at the time.

    The 500k results that I uploaded at once came to somewhere around 20 MB.

    Not horrible if you have broadband or if you dump on a regular basis, but if you try to run multiple computers for a month at a time and then upload via dial-up, you might be in a world of hurt.
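
    To put rough numbers on that, a sketch of the dial-up arithmetic in Python. The 220 KB figure is from above; the buffered-file count and the 33.6 kbit/s uplink speed are assumptions for illustration.

    code:
    # Rough dial-up upload time for a month of buffered results.
    KB_PER_FILE = 220       # observed .bz2 size from an earlier protein
    files_buffered = 180    # assumption: ~6 per day for a month
    uplink_kbps = 33.6      # assumption: nominal dial-up speed, kilobits/s

    total_kb = KB_PER_FILE * files_buffered
    seconds = total_kb * 8 / uplink_kbps
    print(f"{total_kb / 1024:.0f} MB -> about {seconds / 3600:.1f} hours of uploading")
    # ~39 MB -> about 2.6 hours of uploading, per month of buffered work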

  5. #5
    Originally posted by Brian the Fist
    They can accumulate several weeks worth of work with -df - basically unlimited though there is a finite limit (1000 file pairs for now, but this may change)
    Could someone give a good example of how many file pairs would be generated by a 1 GHz PIII in a given time frame?
    I've got 5 duallies that I'm sometimes away from for 3 weeks at a time, but more often 2 weeks, all with no net access. They have been running G@H.
    Also - is it possible to run as a scheduled task under NT4?
    This lets my subadmins kill the process easily.
    tia

  6. #6
    Richard Clyne
    At a very rough estimate I would expect a 1 GHz PIII to produce a fileset every 2 hours. I am basing this on my P3 550 (256 KB L2 cache), which completes a fileset every 4 hours.

    You might want to run a program like this ( http://www.free-dc.org/forum/showthr...=&threadid=871 ) to get a more accurate reading.

    Not sure about running as a scheduled service.
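
    That estimate is just naive clock-speed scaling; a quick sketch of the reasoning in Python (the 550 MHz / 4-hour baseline is my machine above, and linear scaling with clock speed is an assumption that ignores cache and memory differences):

    code:
    # Naive clock-speed scaling of fileset time.
    BASELINE_MHZ = 550      # P3 550, 256 KB L2 cache
    BASELINE_HOURS = 4.0    # observed: one fileset every ~4 hours

    def hours_per_fileset(mhz):
        """Assume work rate scales linearly with clock speed."""
        return BASELINE_HOURS * BASELINE_MHZ / mhz

    print(f"1 GHz PIII: about {hours_per_fileset(1000):.1f} hours per fileset")
    # 1 GHz PIII: about 2.2 hours per fileset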

  7. #7
    I am not sure what a scheduled service entails, though it will run as a service quite easily (and has a low enough priority to avoid interfering with the MS junk like Outlook).

    A lot of information on running as a service, config options, running on a cluster, etc. is in the file "readme1st.txt" that comes with the client.

    I have attempted to attach a copy to this post for your viewing convenience, though I have no idea if it will work or if I am going to anger the admin gods.
    Attached Files

  8. #8
    Looks like we got lucky, gnewbury. The following is courtesy of Dyyryath and his new Linux benchmarking script.

    How about a dual P3-1000?

    Processor 1:

    code:
    ------------------------------------------------------------
    Distributed Folding Linux Benchmark Script V1.0

    Sample Size: 309002 structures over 679913 seconds.

    Structures Per Second:0.45
    Structures Per Minute:27.27
    Structures Per Hour:1636.10
    Structures Per Day:39266.40

    Linux OS - Running Kernel Version 2.4.8-26mdksmp
    Pentium III (Coppermine) @ 993mhz (256 KB cache)
    ------------------------------------------------------------

    Processor 2:

    code:
    ------------------------------------------------------------
    Distributed Folding Linux Benchmark Script V1.0

    Sample Size: 313240 structures over 679915 seconds.

    Structures Per Second:0.46
    Structures Per Minute:27.64
    Structures Per Hour:1658.54
    Structures Per Day:39804.96

    Linux OS - Running Kernel Version 2.4.8-26mdksmp
    Pentium III (Coppermine) @ 993mhz (256 KB cache)
    ------------------------------------------------------------

    Comments (mine):

    40,000 structures/day = 8 sets of structures (5k/'set') per day. Each CPU will have its own instance (install) of the client, so the 1000-set limit that Howard has temporarily implemented applies to each install individually.

    8 (sets/day) * 21 (days in 3 weeks) = 168 structure sets (pairs of .log and .val .bz2 files). At ~200k/set that is only about 33 MB, so bandwidth to upload will be a much larger concern than reaching the 1000-set limit.

    1000 sets * 5000 structures = 5 million structures maximum accumulated (unless I misplaced some zeros again). I don't think you will hit that number on a single P3 CPU in any reasonable length of time (I say single because each CPU has its own install, even in a dual-CPU rig).
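
    For anyone who wants to redo this arithmetic with their own numbers, a sketch in Python using only the figures quoted in this post:

    code:
    # Recompute the buffering math from the benchmark numbers above.
    structures_per_day = 40000   # one P3-1000 CPU (benchmark above, rounded)
    structures_per_set = 5000    # current 'set' size
    set_cap = 1000               # Howard's temporary limit, per install
    kb_per_set = 200             # approx. size of one .log/.val .bz2 pair

    sets_per_day = structures_per_day // structures_per_set  # 8
    sets_in_3_weeks = sets_per_day * 21                      # 168
    upload_mb = sets_in_3_weeks * kb_per_set / 1024          # ~33 MB per CPU
    days_to_cap = set_cap / sets_per_day                     # 125 days
    max_structures = set_cap * structures_per_set            # 5,000,000

    print(sets_per_day, sets_in_3_weeks, round(upload_mb), days_to_cap)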

    Last edited by MAD-ness; 05-23-2002 at 08:39 PM.

  9. #9
    Thanks MAD-ness
    Thus, extrapolating to 8 sets/day per GHz would let me run 1 GHz machines for about 120 days. This is great. Since I've got cable for uploads, even if I had to move 250 MB for 3 weeks of running it's doable. That's only 2 floppies (LS-120s).
    How well do the results compress? G@H normally goes down to 13% of the original size.

    As far as "scheduled service" goes, I wrote "scheduled task", as in (on NT4)
    My Computer -> Scheduled Tasks.
    Thanks guys.

  10. #10
    If we are doing protein runs of 10 billion structures, you might well get 120 days, but smaller runs (such as CASP5 candidates?) will be shorter. In the long run the 'sample' size will be 10 billion structures, but for CASP5 they don't have the time to do a sample that large, and I don't think they need one.

    Hopefully the auto-update and other issues will be resolved before we start in on CASP5 prediction.
