Client SPEED enhancement



Brian the Fist
07-20-2002, 01:25 PM
I've been thinking about this for a while but haven't gotten around to it until now. Anyhow, with surprisingly minor code changes, I have sped up the client 200%! :shocked: Yes, that's right, twice as fast, that's not a typo.

The catch? Well, it loads the entire protein.trj into RAM at once, rather than reading one residue at a time. Thus it will require a total of roughly 25 + 0.64N MB of RAM, where N is the number of residues. A 200-residue protein (about the largest we will attempt in the near future) would therefore suck up 153MB RAM. Still, a small price to pay for speed doubling, especially if you use your machine only for DF, or if you have 512MB RAM. Note that's EACH process, so a dual machine would need 306MB RAM. Needless to say, this is an 'optional' switch. It should speed up the screensaver too, so an option will be added there as well.
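A minimal sketch of the memory estimate, taking the 25 + 0.64N formula at face value. The function name and the `processes` parameter are made up for illustration and are not part of the client.

```python
def estimated_ram_mb(residues: int, processes: int = 1) -> float:
    """Rough RAM estimate in MB from the formula 25 + 0.64 * N per process."""
    base_mb = 25.0          # fixed overhead quoted in the post
    per_residue_mb = 0.64   # extra MB per residue quoted in the post
    return processes * (base_mb + per_residue_mb * residues)


if __name__ == "__main__":
    # Print the estimate for a few protein sizes, single and dual process.
    for n in (100, 138, 200):
        single = estimated_ram_mb(n)
        dual = estimated_ram_mb(n, processes=2)
        print(f"{n} residues: {single:.1f} MB single, {dual:.1f} MB dual")
```

Since each process holds its own copy, a dual machine simply doubles the figure.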

Just thought I'd whet your appetites :spray: for what's coming up.

Richard Clyne
07-20-2002, 01:45 PM
:shocked: WOW, you could have knocked me down with a feather when I read that.

Nice one Brian:notworthy

When can we expect this to be available?

IronBits
07-20-2002, 02:04 PM
Me wants needs it now! ;)
Don't let Dyyryath have it until I pass him again :D
Great work Brian! :cheers:

KWSN_Millennium2001Guy
07-20-2002, 02:23 PM
I would like to beta-test it for say... 10 protein changes... before it is released to the general population or anybody else. :p :eek:

60 percent of my machines have only 128 megs, 30 percent have 256 and the remaining handful have between 512 and 4gigs. I think I would only be able to make use of it on about 10 boxes. But that is still better than nothing.

Kool! :|party|:

Ni, NI, and Ni! :D

Brian the Fist
07-20-2002, 03:23 PM
That's a shame, 2K1guy, since all 100 of our duals have 512 MB.
It will be incorporated just as soon as I test it for a few months on our cluster - mwu-ha-ha-ha. :jester: But seriously, should be on next Tuesday.

Digital Parasite
07-20-2002, 03:56 PM
Originally posted by Brian the Fist
It will be incorporated just as soon as I test it for a few months on our cluster - mwu-ha-ha-ha. :jester: But seriously, should be on next Tuesday.

Howard, will this be an option on the client for those machines that can't handle using that much RAM? That is great for people with lots of RAM.

Jeff.

Halon50
07-20-2002, 03:56 PM
Out of curiosity, what would happen if the program overflows RAM and goes to swapfile? I've got plenty of 256MB machines in my little farm, and it seems to me they could benefit greatly from a speed increase even if it is only temporary. Will OS thrashing to disk negate or make worse any benefit from running the program in RAM?

Kosh
07-20-2002, 04:52 PM
Time to get those much needed RAM upgrades :p
You've done some really nice work with the client :thumbs:
:notworthy :notworthy :notworthy

MAD-ness
07-20-2002, 05:32 PM
Wow.

Fortunately I hate running anything under 256 megs of RAM, so I should see the benefit of this one.

Great news!

Sort of nice, seeing the client speed up and not down with updates (ala SETI). ;)

Paratima
07-20-2002, 06:19 PM
Well, that's great news. However, I believe there may be a middle ground solution.

Currently, Howard, my guess is that you're allocating memory for each residue, reading in the residue and working it over, then re-allocating your work-area for the next residue before/while reading it in.

On a machine where there is enough RAM to go around, all of this doesn't actually thrash the disk at all, because protein.trj sits happily in the disk buffer. So you're really just "reading" from memory. This can be observed by just watching your hdd light. It lights up for a minute or so when the client starts, then :sleepy:.

What you will also see, if you watch the memory allocation, is some pretty frantic activity. I believe the problem is that you're spending a lot of time allocating and reallocating the memory. I suspect that a lot of the speed differences we see between OS's is a measure of the relative efficiency of their malloc libraries!

So what's the middle path? Calculate the largest buffer you need at program init and pre-allocate that amount of memory. You'll save all that time in malloc and realloc, but it won't require a huge area that a lot of boxen don't have.

Of course, you can still do it both ways, but if the switch says "Don't allocate huge", you can still allocate big-enuff and give everybody a new set of plugs, rotor, and cap. ;)
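A rough sketch of the "allocate big-enuff once" idea, written in Python rather than the client's own language, so it only illustrates the pattern and says nothing about malloc timing. The record sizes, the in-memory stream and the checksum "work" are all stand-ins; the real protein.trj layout is not described in this thread.

```python
import io


def process_records_prealloc(stream, record_sizes):
    """Read variable-size records into ONE buffer sized for the largest record."""
    buf = bytearray(max(record_sizes))   # allocated once at init, never resized
    view = memoryview(buf)               # lets us fill slices of buf without copying
    results = []
    for size in record_sizes:
        got = stream.readinto(view[:size])   # overwrite the same storage each time
        if got != size:
            raise EOFError("file ended in the middle of a record")
        results.append(sum(view[:size]))     # placeholder for the per-residue work
    return results


if __name__ == "__main__":
    fake_file = io.BytesIO(bytes(range(10)))   # stand-in for a trajectory file
    print(process_records_prealloc(fake_file, [3, 2, 5]))
```

The memory footprint stays at the size of the largest record, which is the point of the middle path: no per-record allocation, but no whole-file buffer either.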

dnar
07-20-2002, 11:27 PM
Originally posted by Halon50
Out of curiosity, what would happen if the program overflows RAM and goes to swapfile? I've got plenty of 256MB machines in my little farm, and it seems to me they could benefit greatly from a speed increase even if it is only temporary. Will OS thrashing to disk negate or make worse any benefit from running the program in RAM?
I doubt the DF client data would be pushed to swap, as stuff not accessed for a long time is pushed first.

bubbadog
07-21-2002, 07:52 AM
:notworthy :notworthy :notworthy :notworthy :notworthy :notworthy :notworthy :notworthy

This is why we love you Howard! :thumbs:

Gunslinger
07-21-2002, 08:02 AM
Does 200% faster mean 3 times as many results to upload?
That will be a real pain on my existing bandwidth. :(

Will the download interval take into account the speed (or even be user settable? :cool: )


Mind you, on second thoughts it might not be an issue since I only have four machines with more than 256Mb (out of 63). :rolleyes:

TheOtherZaphod
07-21-2002, 10:17 AM
One of my favorite math police crimes:

200% faster than = 300% as fast as

Let's hope Howard is not guilty.

Digital Parasite
07-21-2002, 10:21 AM
Originally posted by Gunslinger
Does 200% faster mean 3 times as many results to upload?
That will be a real pain on my existing bandwidth. :(

Will the download interval take into account the speed (or even be user settable? :cool: )

200% faster should mean 3 times as many results to upload.

There is no download interval, do you mean upload interval? ie: how often it will upload a batch of structures?

With the current version of the DF Client there is a new -s switch you can use to set the upload interval. The lowest value is 999, which means use the server default; above that you can range it from 1000 to 10000 to manage how often your client will upload structures.

Jeff.
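A small sketch of how the -s values described above could be checked before launching the client. Only the three numbers (999, 1000, 10000) come from the post; the function name and the returned strings are hypothetical, and the unit of the interval is not stated in the thread.

```python
SERVER_DEFAULT = 999    # "use the server default", per the post
MIN_INTERVAL = 1000     # lowest custom value mentioned
MAX_INTERVAL = 10000    # highest custom value mentioned


def describe_upload_setting(value: int) -> str:
    """Classify a candidate -s value, rejecting anything outside the stated range."""
    if value == SERVER_DEFAULT:
        return "server default"
    if MIN_INTERVAL <= value <= MAX_INTERVAL:
        return f"custom interval {value}"
    raise ValueError(f"-s must be 999 or between 1000 and 10000, got {value}")


if __name__ == "__main__":
    for candidate in (999, 1000, 5000, 10000):
        print(candidate, "->", describe_upload_setting(candidate))
```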

Brian the Fist
07-21-2002, 12:12 PM
If you actually READ my post (why am I thinking broken telephone here..) it says it is 'sped up 200%, meaning twice as fast'.
Thus we'll be doubling production (roughly), not tripling.
I'm not releasing the extra 100% speedup from 200% to 300% for a while so you still have something to look forward to :crazy:

Richard Clyne
07-21-2002, 02:40 PM
Originally posted by Brian the Fist
If you actually READ my post (why am I thinking broken telephone here..) it says it is 'sped up 200%, meaning twice as fast'.
Thus we'll be doubling production (roughly), not tripling.
I'm not releasing the extra 100% speedup from 200% to 300% for a while so you still have something to look forward to :crazy:

You have to give us something to look forward to next Tuesday:jester:

Digital Parasite
07-21-2002, 05:07 PM
Originally posted by Brian the Fist
If you actually READ my post (why am I thinking broken telephone here..) it says it is 'sped up 200%, meaning twice as fast'.
Thus we'll be doubling production (roughly), not tripling.

I was so excited to see the 200% speedup that I stopped reading there and probably didn't want to think that it was really only a 100% speedup like you meant to say. :cry:

Jeff.

bubbadog
07-21-2002, 08:23 PM
Howard, don't listen to them; you may call it whatever you like as long as I generate structures twice as fast. ;) I have a few people on my team that I must thwack! :spank:

dnar
07-21-2002, 09:59 PM
Originally posted by bubbadog
Howard, don't listen to them; you may call it whatever you like as long as I generate structures twice as fast. ;) I have a few people on my team that I must thwack! :spank:
Doof, and I was looking forward to a client that would process in half the time. Oh well. :D

RipItUp
07-22-2002, 02:47 AM
I can understand the confusion on 100% and 200% but where did 300% come from? 2x or 4x the performance but 3x the performance would be 150% no?

It's all a bit moot anyhow as Brian mentioned doubling in the first post and we all knew what he meant, except for the guys who still insist that 2001 was the start of the new millennium and had a very lonely party with lots of uneaten jelly and mini pork pies.

:moon:


Regards

Andy

xman
07-22-2002, 06:28 AM
Will it be equivalent to using a RAM disk? I used a RAM disk on my AMD 1.2GHz, but I only got about a 10% reduction in time.

wirthi
07-22-2002, 06:46 AM
Originally posted by RipItUp
I can understand the confusion on 100% and 200% but where did 300% come from? 2x or 4x the performance but 3x the performance would be 150% no?


No, you are wrong about that. 100% AS fast = equally fast. 200% AS fast = twice as fast. 150% AS fast = 1.5 times as fast = right in the middle between equally and twice as fast ....

OR:

100% FASTER = twice as fast. 200% FASTER = three times as fast. 150% FASTER = 2.5 times as fast (example: 1000 structures before, now 2500).

So Howard actually said "three times as fast", while he meant "twice as fast" (as far as I remember).

Greets,
Wirthi
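The two readings of a percentage can be written out as a quick sketch. The function names are made up for illustration; the arithmetic is just the "AS fast" versus "FASTER" distinction from the post above.

```python
def as_fast_to_multiplier(percent_as_fast: float) -> float:
    """'X% AS fast' is a plain ratio: 200% as fast means 2x."""
    return percent_as_fast / 100.0


def faster_to_multiplier(percent_faster: float) -> float:
    """'X% FASTER' is an increase on top of the original: 200% faster means 3x."""
    return 1.0 + percent_faster / 100.0


if __name__ == "__main__":
    for pct in (100, 150, 200):
        print(f"{pct}% as fast = {as_fast_to_multiplier(pct)}x, "
              f"{pct}% faster = {faster_to_multiplier(pct)}x")
```

With 1000 structures before, "150% faster" gives 1000 * 2.5 = 2500, matching the example in the post.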

willebenn
07-22-2002, 07:22 AM
What Mr Fist didn't tell you is that
all new proteins will be 400% slower :jester:

Michael H.W. Weber
07-22-2002, 09:35 AM
Howard - excellent work!!! :cheers:

Michael.

dnar
07-22-2002, 08:10 PM
Everyone is very excited about the new client speed-up of 200%/2 times/3 times as fast but I must ask, will the server(s) handle the extra load of work being returned twice as often? It appears that we have not yet seen an update period where the server(s) have handled the load without issue....

Just a thought. :D

Gunslinger
07-22-2002, 08:37 PM
Ok, I bit the bullet and ordered a T1 link today. :cool: Will be a few weeks in coming, but should sort out the bandwidth issues at my end ( especially if there are only 100% more results to upload :rolleyes: )

Can't do much about the RAM situation though - I just hope that it doesn't thrash too much on the lower memory machines as that will interfere with the normal users. I did some experiments today with a simple memory allocator - problem is that the low priority active task swaps out high priority 'idle' tasks, so there is a noticeable delay when the user goes active again :o

dnar
07-22-2002, 10:55 PM
Originally posted by Gunslinger
I did some experiments today with a simple memory allocator - problem is that the low priority active task swaps out high priority 'idle' tasks, so there is a noticeable delay when the user goes active again :o
You must be using Winblows. :D Try that on Linux my friend! :D

Gunslinger
07-23-2002, 04:08 AM
Originally posted by dnar

You must be using Winblows. :D Try that on Linux my friend! :D

I would love to, unfortunately due to the nature of the technology we develop we are pure Micro$oft/Intel :bang: (the latter helps loads for DF tho' :thumbs: ).

I'm beavering away on using something different for the next generation of servers. The changes to the licensing are giving me a lot of leverage at the moment - I think they have shot themselves in the foot by being too greedy :rolleyes:

dnar
07-23-2002, 04:29 AM
Originally posted by Gunslinger

I think they have shot themselves in the foot by being too greedy :rolleyes:
M$ Greedy? No! :eek:

GNU = Good! :|party|:

RipItUp
07-23-2002, 02:50 PM
Not using the -rt parameter:

138AA
46800 structures per day
mem usage 25MB

Using the -rt parameter:

138AA
111600 structures per day
mem usage 115MB

:thumbs:

Does what it says on the tin.

Regards

Andy

Jodie
07-23-2002, 04:31 PM
Gosh - your machines are that memory-starved, M2kG? Wow... Sucks to be you... BWAHAHAHAHAHAHAHAHAHA! (Glad I put 256M in all the toasters. Downside is that they're all SDR. [pout])

Hmmm, looks like as soon as I can find the time to change all those boxes over to the new option - you may just be gettin' a HURTIN' my favorite nemesis! BWAHAHAHAHAHAHAHAHAHAHA!

:D :crazy: :moon: :haddock: :crazy:

KWSN_Millennium2001Guy
07-23-2002, 07:03 PM
I just placed an order for 70 256meg sticks.

RAM is dirt-cheap, and think of the performance increase the end users will get. :p :jester:

Ni!

Halon50
07-23-2002, 07:12 PM
Good idea M2k+1G...time to buy up some of those $20 128MB sticks and upgrade the farm... :)

GOLDENBALLSAINTYORK
07-23-2002, 07:49 PM
I've only got SDRAM... What's this DDR stuff...
By the way..I couldn't find a service.config file for the -dt? command line or whatever..???
I installed Jeffs GUI v1.7 and ticked the box, and it seems to work o.k??
I know how to cover things with chocolate, but I don't mind saying I'm a bit confused...Ah well each to their own..:(
I still know that a Billion is 1,000,000,000,000 though...:|punch|:
ATB
Ian AKA GOLDENBALLS

KWSN_Millennium2001Guy
07-23-2002, 08:09 PM
DDR is just like SDRAM, but it sends data twice per clock cycle. Your chipset must support it for it to work for you. SDRAM is slightly cheaper at the moment, but the prices are jumping back and forth, so buy low, sell high.

Ni!

GOLDENBALLSAINTYORK
07-23-2002, 08:51 PM
NI!
What happened to your outing with The KWSN who say Ni (without an ! ?):crazy:
They must be a dour lot not to have risen to the infiltration!
and they don't even have a message board for taunting...:moon:
Oh well!
One looks forward to the new ammo you will install in your boxen.
Just don't hire the Wabbitt to do it!:help:
Is he still flirting with G@H??
Sad Bunny...:confused:
ATB,
GBSY

RipItUp
07-24-2002, 02:06 AM
Jodie and KWSNM2001 make me laugh. What a couple of megalomaniacs :)

Next they'll be upping the FSB speed for " the benefit of the end users " ..heh.

My XP1500 at 1476 MHz outperforms my XP1800 at 1600 MHz due to the increased FSB.

No doubt Jodie will respond to KWSNM2001 's increased RAM by buying a 4 gallon pot of silver conductive paint for the L1's ;)

Regards

Andy

Jodie
07-24-2002, 11:23 AM
I have a tiny advantage - I don't have to care about the 'end users' 'cause there aren't any... My whole farm is devoted 100% to df... :scared:

Every AMD I have is unlocked and OC'd - which is 90+% of the farm - I *am* on the OCN team, after all! ;)

Heya, M2kG - did you add more machines a couple weeks ago? Just curious...

And I'm not a megalomaniac - I just need to win. BWAHAHAHAHAHAHHHAHAHAHA! ;)

TheOtherZaphod
07-24-2002, 11:37 AM
General rule of thumb: Just stay out of their way, and nobody gets hurt.

RAM prices aren't nearly as "rock bottom" as they have been. Fortunately most of my machines were already 256MB.

willebenn
07-24-2002, 01:00 PM
Some quick and dirty tests show some interesting results I think.
Both systems using W2k, client switches -df -qt -g 100 -it
3 hour runs stopped and rerun

Both using -rt
AMD 1700xp @1463 82100/day
Intl P4 1.6a @1.6 65500/day

Both using -rf
AMD 1700xp @1463 42000/day
Intl P4 1.6a @1.6 43000/day

Previous protein
AMD 1700xp@1463 118000/day
Intl P4 1.6a@1.6 132000/day

Note the larger gain using -rt with the AMD
Anybody else see this?

AMD users be happy I guess.
:cheers:

P4 users :bang:

Kosh
07-24-2002, 01:22 PM
What type of RAM are you using with the P4?

I've noticed that my boxes with sdram aren't keeping up with my single (:() ddr box.

Jodie
07-24-2002, 01:24 PM
The P4 is *really* crippled by memory bandwidth in just about everything.

The 533 DDR is better - the RD800 is MUCH better - but I've not profiled in DF - just in compiling and in video compression/image processing.

We've done substantial profiling in real-world applications for 4th gen settop boxes.

From web-surfing to video compression/decompression, realtime image processing, rich-text editing - you know - the kinda stuff you would do with a computer-vis-television... The P4 just gets its little booty stomped by the AMD in every conceivable way... (yes - that's SSE2 versus SSE on the AMD. We've not optimized to 3DNow2 yet)

The depressing thing is that the P3/512k often humiliates a northwood...

Digital Parasite
07-24-2002, 01:34 PM
Very interesting I was just going to post a message about my observations with the P4 and the new speed increase.

All my P3 machines seemed to have doubled in speed with the new RAM option. But my P4 (1.5 GHz) using RDRAM (PC800) went from 40413 structs/day with -rf to 60909 struct/day with -rt so only about a 50% increase in speed.

Jeff.

willebenn
07-24-2002, 02:14 PM
Sorry, the memory on both systems was DDR, P4 was CL2.
As for the P4 being crippled, see the numbers from the previous protein, quite respectable vs 1700xp yes?( and I don't need a screamer fan and 10ton airconditioner to keep it cool).
Don't want to start a cpu war here just wanted to see if anyone else saw this or just my systems or if there was/is a problem.
It seems Jeff is seeing the same thing, only 50% increase with P4.
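To put the reported figures side by side, here is a quick sketch that turns the before/after structures-per-day numbers posted in this thread into speedup ratios. The labels and the helper names are only for illustration; the numbers are copied from the posts by RipItUp, willebenn and Digital Parasite.

```python
# (label, structures/day without -rt, structures/day with -rt)
REPORTS = [
    ("RipItUp, 138AA", 46800, 111600),
    ("willebenn, AMD 1700xp @1463", 42000, 82100),
    ("willebenn, P4 1.6a @1.6", 43000, 65500),
    ("Digital Parasite, P4 1.5 GHz RDRAM", 40413, 60909),
]


def speedup(before: float, after: float) -> float:
    """Ratio of new throughput to old throughput (2.0 means twice as fast)."""
    return after / before


def summarize(reports):
    """Return (label, speedup rounded to two decimals) for each report."""
    return [(label, round(speedup(b, a), 2)) for label, b, a in reports]


if __name__ == "__main__":
    for label, ratio in summarize(REPORTS):
        print(f"{label}: {ratio}x as fast, i.e. {round((ratio - 1) * 100)}% faster")
```

Both P4 entries land near 1.5x while the AMD entries are at or above roughly 2x, which is the gap being discussed.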

baja27
07-24-2002, 05:23 PM
Willebenn,
yes, my AMD T-bred 2200 @1880 with the -rt option is now faster than my P4 @2652MHz.
:|party|:

dtsang
08-11-2002, 08:55 AM
Using the -rt switch on the latest protein, I can fold 33,000 proteins in one day.

I am running a Power Mac G4 with a PowerPC 7400 chip clocked at 466 MHz. I have 384 MB of RAM.

dtsang
08-14-2002, 11:11 PM
Is it just me, or is this new protein a bit slower ?

FoBoT
08-14-2002, 11:16 PM
s l o w e r

StandrdDev
02-08-2003, 11:59 PM
Just was wondering if this ever panned out.

tpdooley
02-09-2003, 06:32 AM
Yep.. we got a client that you can turn the "use extra ram" switch on, and if you have at least 256Megs, it puts it to use, and folds almost twice as fast as the client without the switch.

That's why we were all running around adding extra ram to our machines... :)
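For anyone curious what such a switch amounts to, here is a toy sketch of the trade-off: keep the whole file in memory, or go back to the file for every record. The fixed-size records and the class name are assumptions for illustration only, since the actual protein.trj format and client code are not shown in this thread.

```python
import io


class RecordSource:
    """Serve fixed-size records either from a RAM copy or by seeking in the file."""

    def __init__(self, stream, record_size: int, use_ram: bool):
        self.stream = stream
        self.record_size = record_size
        # With the switch on, pay the memory cost once and never touch the file again.
        self.cache = stream.read() if use_ram else None

    def get(self, index: int) -> bytes:
        start = index * self.record_size
        if self.cache is not None:
            return self.cache[start:start + self.record_size]   # pure memory access
        self.stream.seek(start)                                  # file access per record
        return self.stream.read(self.record_size)


if __name__ == "__main__":
    data = b"aabbccdd"   # four made-up 2-byte records
    for use_ram in (False, True):
        source = RecordSource(io.BytesIO(data), record_size=2, use_ram=use_ram)
        print(use_ram, source.get(2))
```

Both modes return the same records; the only difference is where the bytes live, which is why the option costs RAM and nothing else.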

FoBoT
02-10-2003, 07:10 PM
actually, depending on the OS and how you use the box, you don't really need 256MB

if you run a "lite" OS, like many linux versions or Windows 95/98 on boxes with 128MB AND it is just a cruncher (nobody using the computer for non-DC stuff) then the -rt switch can also be successfully employed for 2X production

Windows XP is too fat, even if the box is only a DC cruncher, there isn't enough of the 128MB left over to use -rt

IronBits
02-10-2003, 08:08 PM
I believe the rule of thumb is 96mb for the OS (w2k/XP) +8mb per application you want open all at the same time... with only 128mb, you barely have enough for the basics... ;)
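Combining that rule of thumb with the 25 + 0.64N estimate from the first post gives a rough fit check. This is only a sketch: the function and parameter names are invented, the 96 MB and 8 MB defaults are the w2k/XP rule of thumb above, and a lighter OS would need a smaller `os_mb` value that this thread does not specify.

```python
def fits_in_ram(total_mb: float, residues: int, processes: int = 1,
                open_apps: int = 0, os_mb: float = 96.0,
                per_app_mb: float = 8.0) -> bool:
    """Return True if the estimated client footprint fits in what the OS leaves free."""
    needed = processes * (25.0 + 0.64 * residues)          # estimate from the first post
    available = total_mb - os_mb - per_app_mb * open_apps  # rule of thumb from this post
    return needed <= available


if __name__ == "__main__":
    # 138 residues is the protein size RipItUp reported earlier in the thread.
    for ram in (128, 256, 512):
        print(f"{ram} MB, 138 residues, one process:", fits_in_ram(ram, 138))
```

On these assumptions a 128 MB w2k/XP box falls short for a 138-residue protein while a 256 MB box has room for one process.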