Call for Benchmarks



ddn
07-25-2003, 08:06 PM
Requesting benchmarks:

Please submit them in this thread in this format:
Manufacturer,Processor Type,CPU Speed in MHz,OS,Maketrj usr,Maketrj sys,Foldtraj usr,Foldtraj sys

e.g.:
Intel,Xeon,2600,RedHat Linux 9.0,2.710,0.820,24.580,11.740
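
For anyone collecting the submissions, here is a minimal Python sketch of reading one line in the format above. The `Submission` class and the `parse_submission` helper are illustrative names, not part of the client; the sketch only assumes eight comma-separated fields in the order given, with optional spaces after the commas.

```python
import csv
from dataclasses import dataclass


@dataclass
class Submission:
    manufacturer: str
    processor: str
    mhz: int
    os: str
    maketrj_usr: float
    maketrj_sys: float
    foldtraj_usr: float
    foldtraj_sys: float


def parse_submission(line: str) -> Submission:
    """Parse one comma-separated benchmark line into a Submission."""
    # csv handles the splitting; strip() tolerates "AMD, XP 2100+, ..." style spacing.
    row = [field.strip() for field in next(csv.reader([line]))]
    if len(row) != 8:
        raise ValueError(f"expected 8 fields, got {len(row)}: {line!r}")
    timings = [float(value) for value in row[4:]]  # float() also accepts ".344"
    return Submission(row[0], row[1], int(row[2]), row[3], *timings)


example = parse_submission("Intel,Xeon,2600,RedHat Linux 9.0,2.710,0.820,24.580,11.740")
print(example)
```

Lines posted one field per row, or with a field missing, would need to be tidied by hand first; the helper raises `ValueError` rather than guessing.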

gistech1978
07-25-2003, 09:40 PM
AMD, XP 2100+, 2000, Windows XP Pro. SP1, 7.609, 0.359, 36.625, 6.891

hallmar
07-26-2003, 02:58 AM
AMD
xp2500+ Barton
2367mhz
XP Pro sp1
6.125
.344
31.516
6.000

:)

BuddhaMan
07-26-2003, 03:15 PM
Remember this is an "apples to oranges" comparison as the bench is done on the current generation of the client.

Suggestion: Someone post a zip file for Windows and a Linux tar.gz of the client install directory in its entirety so we all bench using the same data. Include a .bat file for Windows and bash script for Linux to run the client quiet, with large mem and no net access.

wirthi
07-26-2003, 03:18 PM
Originally posted by BuddhaMan
Remember this is an "apples to oranges" comparison as the bench is done on the current generation of the client.

Suggestion: Someone post a zip file for Windows and a Linux tar.gz of the client install directory in its entirety so we all bench using the same data. Include a .bat file for Windows and bash script for Linux to run the client quiet, with large mem and no net access.

It should be enough to delete your filelist.txt for that. So, copy your client to another directory, delete its filelist, run the benchmark, then delete the benchmark directory.

Doesn't the built-in benchmark ignore which generation you are at?

Brian the Fist
07-26-2003, 03:39 PM
The benchmark is always the same test and is independent of the protein being folded or client version. The only thing it depends on is hardware and the compiler and compiler flags we used to build it. (And occasionally, very low-level changes to the folding algorithm could affect the timing slightly).

BuddhaMan
07-26-2003, 04:16 PM
Ok, I sit corrected then. :cool: I thought I had read somewhere what I wrote above.

With that being said, here's my results:


AMD, Athlon Thunderbird,1400,Win2K-SP4,12.127,1.052,54.208,19.198

Intel, Pentium 3,900,WinXP-SP1,18.547,0.911,118.100,22.763

Intel,Celeron,550,Win2K-SP4,26.422,2.031,121.422,47.344

derek
07-26-2003, 04:59 PM
Just an fyi to the OSX client maintainer : bench crashes on the OSX client under 10.2.6 (both server and the normal jag distribution). It crashes on an iBook 800MHz, dual Xserves (1.3GHz), and a 12" power book.

- derek

dano
07-26-2003, 06:26 PM
Intel,P4,1800,windows2000,14.578,0.875,60.750,23.031

Intel,P4,1800,Mandrake 9.1,4.860,1.240,49.550,20.330

Linux seems to be quite a bit faster on the same hardware.

AMD,xp 2600+,2130,mandrake 9.0,4.180,0.840,41.010,14.210

Brian the Fist
07-26-2003, 08:39 PM
Originally posted by derek
Just an fyi to the OSX client maintainer : bench crashes on the OSX client under 10.2.6 (both server and the normal jag distribution). It crashes on an iBook 800MHz, dual Xserves (1.3GHz), and a 12" power book.

- derek

hmm, I get a message in the error.log 'foldtraj returned an error 3' - is that what you are talking about?

derek
07-26-2003, 09:10 PM
Originally posted by Brian the Fist
hmm, I get a message in the error.log 'foldtraj returned an error 3' - is that what you are talking about?

Yup it sure is, I should have posted the error.log. Sorry.

- derek

ddn
07-26-2003, 11:17 PM
Sun,UltraSparcIIi 500,500,Solaris 8,15.840,1.550,225.600,5.070
Intel,Celeron 550,550,RedHat Linux 9.0,15.800,3.490,140.450,66.520
Intel,Celeron 550,550,Windows 2000 SP4,26.422,2.031,121.422,47.344
Sun,UltraSparc III 900,900,Solaris 9, 7.530,1.070,121.990,3.270
Intel,Pentium 3,900,Windows XP SP1,18.547,0.911,118.100,22.763
AMD,Athlon 1050,1050,RedHat Linux 9.0,8.530,0.860,93.400,16.570
AMD,Athlon XP 1200,1200,RedHat Linux 9.0,7.620,0.700,86.870,15.010
AMD,Athlon Thunderbird 1400,1400,Windows 2000 SP4,12.127,1.052,54.208,19.198
AMD,Athlon 1400,1400,RedHat Linux 9.0,7.140,0.680,78.900,12.220
Intel,Celeron 1700,1700,RedHat Linux 9.0,6.860,1.220,48.710,21.280
Intel,Pentium 4 1800,1800,Windows 2000,14.578,0.875,60.750,23.031
Intel,Pentium 4 1800,1800,Mandrake 9.1,4.860,1.240,49.550,20.330
AMD,Athlon XP 2000,2000,Windows XP Pro. SP1,7.609,0.359,36.625,6.891
AMD,Athlon XP 2130,2130,Mandrake 9.0,4.180,0.840,41.010,14.210
AMD,Athlon XP 2367,2367,Windows XP Pro SP1,6.125,.344,31.516,6.000
Intel,Pentium 4 2400,2400,RedHat Linux 9.0,3.210,0.800,29.290,11.920
Intel,Xeon 2600,2600,RedHat Linux 9.0,2.710,0.820,24.580,11.740



Linux is much faster for the same hardware.

I get the same error on the OSX client.

http://www.uberh4x0r.org/~david/folding/graph.jpg
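
Only two CPUs in the table above were benched under both Linux and Windows, so the same-hardware comparison can be checked directly. The sketch below is a rough illustration, not anything the client provides: the `PAIRS` dictionary simply copies those four rows, and `totals` and `faster_os` are made-up helper names. Summing usr+sys per phase is an assumption about what to compare.

```python
# Timings copied from the table: (maketrj_usr, maketrj_sys, foldtraj_usr, foldtraj_sys)
PAIRS = {
    "Celeron 550": {
        "RedHat Linux 9.0": (15.800, 3.490, 140.450, 66.520),
        "Windows 2000 SP4": (26.422, 2.031, 121.422, 47.344),
    },
    "Pentium 4 1800": {
        "Mandrake 9.1": (4.860, 1.240, 49.550, 20.330),
        "Windows 2000": (14.578, 0.875, 60.750, 23.031),
    },
}


def totals(times):
    """Return (maketrj usr+sys, foldtraj usr+sys) for one row."""
    mk_usr, mk_sys, ft_usr, ft_sys = times
    return mk_usr + mk_sys, ft_usr + ft_sys


def faster_os(cpu, phase):
    """phase 0 = Maketrj, phase 1 = Foldtraj; return the OS with the lower total."""
    return min(PAIRS[cpu], key=lambda name: totals(PAIRS[cpu][name])[phase])


for cpu, rows in PAIRS.items():
    for name, times in rows.items():
        maketrj, foldtraj = totals(times)
        print(f"{cpu:15s} {name:18s} maketrj={maketrj:7.3f} foldtraj={foldtraj:8.3f}")
```

On these rows Linux has the lower Maketrj total on both machines, while the Foldtraj total favours Linux on the Pentium 4 1800 and Windows on the Celeron 550. Two machines are far too few to draw a general conclusion either way.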

Grumpy
07-27-2003, 12:07 AM
That P4 2400 score seems a bit too good to be right :( :confused:

Barton 2233 Mhz Win2000 6.297, 0.172, 33.016, 6.609

ddn
07-27-2003, 12:36 AM
[root@mail distribfold]# cat /proc/cpuinfo | grep "model name" && ./foldtrajlite -bench
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz
One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      3.180     0.780
Foldtraj    29.710    11.640

Grumpy
07-27-2003, 02:07 AM
:Pokes: Is that a 800 FSB 2.4 ? Red Hat 9.0 Rox..Whatever you have done to that P4, leave it be :cheers:

erk
07-27-2003, 03:47 AM
Originally posted by derek
Just an fyi to the OSX client maintainer : bench crashes on the OSX client under 10.2.6 (both server and the normal jag distribution). It crashes on an iBook 800MHz, dual Xserves (1.3GHz), and a 12" power book.

- derek

I get the same -bench problem under 10.2.6.

ERROR: [001.001] {foldtrajlite2.c, line 5053} Foldtraj returned an error 3

HaloJones
07-27-2003, 06:14 AM
One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      6.203     0.313
Foldtraj    29.891     6.031

T-Bred @ 2400MHz, WinXP

pointwood
07-27-2003, 06:59 AM
P4 2ghz


Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj     13.547     2.141
Foldtraj    58.766    19.516

Morphy375
07-27-2003, 08:46 AM
XP2700+ (non oc)

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      7.984     0.516
Foldtraj    42.344     6.594

rsbriggs
07-27-2003, 09:14 AM
The P4 2.4 numbers above seem a little odd to me, since they seem to indicate that the P4 outperforms both my P4 2.8 Ghz and 3.2 Ghz systems... Here are the numbers for my 2.8 system

P4 2.8 Ghz, Windows XP:

7.719 0.844
27.547 10.500

same under RedHat 9.0

5.480 0.750
27.390 10.580


Umm. According to the docs, you probably don't want to graph the first number/first row - that deals with making trajectories between generations. You want the first number of the SECOND row, which pertains to actual folding time - the program spends the vast majority of its time in this state.

http://force-four.com/Graph2.gif

Morphy375
07-27-2003, 09:43 AM
One of my P4's:

P4 2,4GHz, 512MB,W2K Server


Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      9.474     0.200
Foldtraj    43.533     8.773

Thought the XP2700+ should be much better.....

TheOtherPhil
07-27-2003, 10:09 AM
Intel, P4, 3500MHz, WinXP Pro SP1:

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      6.219     0.547
Foldtraj    22.000     7.734

TheOtherPhil
07-27-2003, 10:16 AM
Intel P4, 3640MHz, WinXP Pro SP1:

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      5.813     0.688
Foldtraj    21.719     6.969

AMD_is_logical
07-27-2003, 11:25 AM
Originally posted by rsbriggs
Umm. According to the docs, you probably don't want to graph the first number/first row - that deals with making trajectories between generations. You want the first number of the SECOND row, which pertains to actual folding time - the program spends the vast majority of its time in this state.

The official word from Howard (Brian the Fist) on what best represents folding speed is:

If you want just one number, add the usr+sys time for foldtraj together and ignore the maketrj one.

(The above quote is from: http://www.free-dc.org/forum/showthread.php?s=&threadid=3646 )

And I find that if I bench several times, this sum is much more consistent than either the usr or sys time alone.
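
As a small illustration of that rule, here is a Python sketch that pulls the Foldtraj usr+sys total out of a pasted Summary block. The regular expression and the helper names `parse_summary` and `foldtraj_total` are assumptions made for this example; the sample text is the P4 2.4 output posted earlier in the thread.

```python
import re

SAMPLE = """\
Summary
-------
Usr time Sys time
-------- --------
Maketrj 3.180 0.780
Foldtraj 29.710 11.640
"""

# One row: a label followed by two decimal numbers, separated by spaces or tabs.
ROW = re.compile(
    r"^[ \t]*(Maketrj|Foldtraj)[ \t]+(\d*\.?\d+)[ \t]+(\d*\.?\d+)[ \t]*$",
    re.MULTILINE,
)


def parse_summary(text):
    """Return {"Maketrj": (usr, sys), "Foldtraj": (usr, sys)} from benchmark output."""
    return {name: (float(usr), float(sys_)) for name, usr, sys_ in ROW.findall(text)}


def foldtraj_total(text):
    """Single comparison number: Foldtraj usr + sys, ignoring Maketrj."""
    usr, sys_ = parse_summary(text)["Foldtraj"]
    return usr + sys_


print(f"{foldtraj_total(SAMPLE):.3f}")
```

The pattern accepts any amount of space between columns, so it should cope with output pasted with or without its original alignment.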

ddn
07-27-2003, 01:40 PM
You'd save yourselves a lot of time if you'd quit whining about the P4 2400 time. The machine is an IBM xSeries, and it screams. Does it ever occur to you that some machines, at the same Mhz, might be faster than your little ricer-style consumer grade home-built hardware.

TheOtherPhil
07-27-2003, 01:46 PM
Originally posted by ddn
Does it ever occur to you that some machines, at the same Mhz, might be faster than your little ricer-style consumer grade home-built hardware.


Er, no. My homebuilt, consumer grade, ricer style P4 rocks :moon:

rsbriggs
07-27-2003, 01:58 PM
You'd save yourselves a lot of time if you'd quit whining about the P4 2400 time. The machine is an IBM xSeries, and it screams. Does it ever occur to you that some machines, at the same Mhz, might be faster than your little ricer-style consumer grade home-built hardware.

Well, if nothing else you just managed to insult EVERYONE here.

And, if you follow the instructions given by Howard to ignore the maketraj, your machine is nothing special for the range it is in - it is faster than a P2200, and slower than a P2800.

And, I'd be more than happy to compare the price tag of my sub $400 folding boxes to the price tag of your IBMx series box....

Heck - I'd be more than happy to compare the price tag of my entire 7 computer "farm" to the price tag of the IBMx.

ddn
07-27-2003, 04:20 PM
And, if you follow the instructions given by Howard to ignore the maketraj, your machine is nothing special for the range it is in - it is faster than a P2200, and slower than a P2800.


Why don't you go back and actually READ the posts. If you did, you would notice that the maketraj data is there, but is not being used for anything. You are correct that that machine is nothing special, you should tell that to the people complaining that the numbers are "odd".



And, I'd be more than happy to compare the price tag of my sub $400 folding boxes to the price tag of your IBMx series box....

The problem with this comment is that it shows that you are a wanker. If your folding boxes cost anywhere near $400, you are doing it wrong. I could build a folding box for $200, and it would have the highest possible performance.



Heck - I'd be more than happy to compare the price tag of my entire 7 computer "farm" to the price tag of the IBMx.

I'm sorry that your "farm" only consists of 7 boxen, however, if you would check IBM.com, you would notice that an IBM xSeries 305 with a P4 2.4, only costs $1400. But I'll give you a hint, the 305 isn't a folding box. So 7x400 = $2800. You could even buy 2 305's.

Sorry guys, but you gotta play with the little kids before you can play with the big boys.

rsbriggs
07-27-2003, 04:49 PM
Well, everyone here would be interested in hardware suggestions from one of the leading folders - feel free to recommend your favorite hardware configuration. I for one would listen.

Er, umm, where exactly did you say that you were in the overall stats again?

Grumpy
07-27-2003, 05:36 PM
Boy, talk about touchy :Pokes: . I was curious as the P4 speeds are definitely improved over Phase I speeds compared to the AMD CPUs. With Phase I, they could not match the XPs, now the situation seems reversed. Has there been some serious Intel optimization or am I even more of a dumb ass than I thought :looney:

Welnic
07-27-2003, 05:40 PM
Originally posted by ddn
Why don't you go back and actually READ the posts. If you did, you would notice that the maketraj data is there, but is not being used for anything. You are correct that that machine is nothing special, you should tell that to the people complaining that the numbers are "odd".



The maketraj data is there, and it is being used to plot the data in your graph. A lot of people plot graphs to show useful information, it seems some people were sucked into paying attention to yours.

ddn
07-27-2003, 05:58 PM
rsbriggs: Why don't you show me where you are in the overall stats. What's your username?



The maketraj data is there, and it is being used to plot the data in your graph. A lot of people plot graphs to show useful information, it seems some people were sucked into paying attention to yours.

How do you figure it is being used to plot the data in the graph. I did not use all those fields to plot that graph. Besides, it's a very crude graph.

Grumpy
07-27-2003, 06:13 PM
Oh hush you silly person, I was curious for the reasons stated in my above post, I was not meaning to offend you :kiss: Do not embarrass yourself any further, let's just make peace and get on with the results, and maybe someone can cast light on my P4 optimization question.

derek
07-27-2003, 06:19 PM
Actually, I am curious where you guys are in the stats race? Last time I checked, I was 9th in daily production:
http://stats.zerothelement.com/cgi-bin/distributed-folding/render-users.pl?order=LastDayStruct&shading=SHADE&direction=DESC&team=-1&entries=50&color=YES

BTW, my username is derek.

Of course, I have only been doing it Wed/Thur.

I would contribute several bench marks to ddn's graph if the client wouldn't give a strange error under OSX (apparently it gives the same error under PPC Linux).

- derek

Welnic
07-27-2003, 06:22 PM
Sorry, I thought I saw a correlation between the maketraj numbers and the graph. Looking at it again I guess that it is pretty much just random heights.

Grumpy
07-27-2003, 07:09 PM
Humph, I have gone over all the records of Benchmarks from Phase I. It would appear the Phase II Client must have been compiled/written to take advantage of NetBurst and/or SSE2, because the change in performance for P4s from Phase I to Phase II is quite large. Maybe the Compiler Howard is using is just AMD unfriendly...I think the more experienced among us may shed some light on this, maybe Howard can shed some light on the subject :help:

rsbriggs
07-27-2003, 07:26 PM
Originally posted by ddn
rsbriggs: Why don't you show me where you are in the overall stats. What's your username?


I would be "CodeMonkey", 11th place on the FreeDC team, and somewhere around sixty-fifth place overall last time I looked (which was quite a while ago.) In a tight race for 9/10/11th places on the FreeDC team. I'd be more than happy to hear suggestions for high performance $200 folding boxen - you are more than welcome to PM me with the info, or to get my email address :)

I wouldn't mind putting together another 4 or 5 boxen at that price, that might be enough to move me up to team 6th or 7th place :D

tpdooley
07-27-2003, 07:27 PM
Originally posted by ddn
You'd save yourselves a lot of time if you'd quit whining about the P4 2400 time. The machine is an IBM xSeries, and it screams. Does it ever occur to you that some machines, at the same Mhz, might be faster than your little ricer-style consumer grade home-built hardware.

You could have said "The p4 2400 time is from an IBM xSeries, and it screams."

I'm on the chart as BennyRop and bouncing around in the 30's range. I've got 16-20 Windows machines running DF (depending on how many of my helpers decided to quit because of the various Phase II problems). Each of us does what we can for the project, regardless of the nature of the machines we're running. The vast majority of the 1600? current active folders are in the single or double folding machine category (90%?).

TheOtherPhil
07-27-2003, 07:31 PM
Originally posted by Grumpy
Humph, I have gone over all the records of Benchmarks from Phase I. It would appear the Phase II Client must have been compiled/written to take advantage of NetBurst and/or SSE2, because the change in performance for P4s from Phase I to Phase II is quite large. Maybe the Compiler Howard is using is just AMD unfriendly...I think the more experienced among us may shed some light on this, maybe Howard can shed some light on the subject :help:


I think it is more to do with the 800fsb chips on the i865/i875 chipset. SSE/SSE2 doesn't provide any speedup for DF if I remember correctly.

Here's (http://www.anandtech.com/cpu/showdoc.html?i=1834) a nice article that shows just how far behind AMD have been pushed with the release of the i865/i875 chipset/ P4C combo.

ddn
07-27-2003, 07:35 PM
Maybe the Compiler Howard is using is just AMD unfriendly...I think the more experienced among us may shed some light on this, maybe Howard can shed some light on the subject

At the prospect of offending anyone I haven't offended yet: Howard's code sucks. His vast experience has led to PowerPC routines that are broken (try -bench) which indicates to me even the whole PPC client is broken. The code is obviously faster on Intel hardware, because either it is optimized for that, or isn't optimized at all and just runs faster on Intel than AMD.

Look at SETI@Home for a serious computing project. The clients are uber-optimized for each hardware platform by the best people on each respective platform. Do you really think that Howard can write better routines on Sparc than a compiler geek from Sun. I already spoke of the horrid PPC code. If you all, especially Howard, would give up your unrelentingly pompous attitudes, some people with SERIOUS experience (dtj) would be willing to optimize your code.


I'd be more than happy to hear suggestions for high performance $200 folding boxen - you are more than welcome to PM me with the info, or to get my email address

I'm sure you would. If you were on Ars, I'd even tell you.

Like I said, you gotta step out of the nursery before you can play with the big boys.

Grumpy
07-27-2003, 07:53 PM
Yes, I have been wondering about the New 800 FSB/Chipset performance, but even this does not explain the increase in Phase II :confused: Have to upgrade to AMD64 Boxes :whip:

rsbriggs
07-27-2003, 08:51 PM
Originally posted by ddn


I'm sure you would. If you were on Ars, I'd even tell you.

Like I said, you gotta step out of the nursery before you can play with the big boys.

And like I asked before, just where is it exactly your name shows up in the folding stats? Unless your username is Bguinto1, Lemonsqzz, IronBits, PCZ, or Condor, you're just talk (and I know you aren't any of the last three.)

Grumpy
07-27-2003, 09:49 PM
Is that DDN who is 878 on the individual Stats ?

ddn
07-27-2003, 10:00 PM
Yup, ddn. My output is down pretty low right now, I'll have more to work with tomorrow.

My output doesn't change the fact that this project is a half-assed and most of you are jack asses.

rsbriggs
07-27-2003, 10:25 PM
Well, I guess that we all can consider the source of these insults, and be moderately amused by them now. Feel free to speak up again when you hit the top ten producers. Or top 100... Or top 250.... Or maybe even top 500....

Personally, I'd say you have singlehandedly done more to tarnish the reputation of ARS Technica than anyone else I've run across lately. And if I were really in a nasty mood, I'd make a parting personal comment, not associated with either this project or FreeDC : Go away little boy - and take your temper tantrums elsewhere. :harhar: But I'm not in a bad mood at the moment, so I won't say that...

tpdooley
07-27-2003, 10:38 PM
What exactly is wrong with you, ddn? Others of us have made suggestions to the project and managed to do so quietly and calmly, without the need to jump up and down insulting everyone that didn't share our view. Even when we've been told by Howard why our suggestions won't work at present. You're doing an incredible disservice to Ars by acting the way you are. Are there new guidelines on Ars instructing the newest members how to alienate the projects that the team is donating cycles to?

There is skill and technique in writing suggestions; as there are ways of writing constructive criticism. I may not be the best in either field; but if this isn't just an attempt at trolling, you've got a lot of studying to do before you can create something even remotely resembling a working suggestion or constructive criticism.

Brian the Fist
07-28-2003, 12:01 AM
Originally posted by ddn
At the prospect of offending anyone I haven't offended yet: Howard's code sucks. His vast experience has led to PowerPC routines that are broken (try -bench) which indicates to me even the whole PPC client is broken. The code is obviously faster on Intel hardware, because either it is optimized for that, or isn't optimized at all and just runs faster on Intel than AMD.

Look at SETI@Home for a serious computing project. The clients are uber-optimized for each hardware platform by the best people on each respective platform. Do you really think that Howard can write better routines on Sparc than a compiler geek from Sun. I already spoke of the horrid PPC code. If you all, especially Howard, would give up your unrelentingly pompous attitudes, some people with SERIOUS experience (dtj) would be willing to optimize your code.


You, like everyone else here, are entitled to your opinion. Allow me to just point out that, unlike SETI, we have one programmer: me (until just recently anyhow). And how many project managers? One, me. How about scientists? Well, me, and my boss Chris Hogue aka FEEDB0B0. While we compile for close to 16 distinct operating systems, we cannot possibly test everything on all of them. We rely on you, the users, to inform us of such problems. If you think you can do better, perhaps you should take over this project; I could use a break.

We normally test Windows and usually Intel Linux only after making major changes (like adding the benchmark). I am always prompt to look into errors such as this one, and if you'd been around here for more than a week you'd know I'll have the Mac benchmark fixed for the next update (since it's not critical).

As for code optimization, the code is not optimized for any specific platform but the C code itself has been hand-optimized and profiled to achieve decent speedup from the original implementation. I do not doubt that it could be further tweaked on individual platforms but estimate this would achieve less than 10% improvement at best and is not worth the trouble. If someone else (such as dtj you mention) is interested in having a go at it and is willing to sign an NDA and so forth, we may be able to supply him/her with the source to tinker with (this is up to Chris as well as the Hospital). However if he is as rude as you I don't think I want to deal with him. :moon:

Grumpy
07-28-2003, 02:26 AM
And for those who want the Readers Digest version:

:spank: :spank: :spank: :spank: :spank:

We are all on the one Team if you were unaware, and for such an extremely small staffed project support/response is excellent. No further admonishment is needed, I am sure you are already feeling the heat from Senior ARS Members

Morphy375
07-28-2003, 05:46 AM
Maybe this powercruncher should go back to school and learn how adults should behave..... :swear:

(@grumpy: I am a real StatsHO :cool: )

Grumpy
07-28-2003, 09:03 AM
Back to the Topic:

Tyan MPX Win 2K 2400 MP 1 Gig ECC

9.625 , 0.750, 65.219, 18.078 Client 1

9.605, 0.710, 65.267, 18.143 Client 2

Hmmmm, too much overhead going on here

:bs:

TheOtherPhil
07-28-2003, 11:13 AM
I tend to agree with you Grumpy....dual AMD's don't seem that efficient. Here's the fastest of my dual's:

Iwill MPX2, Red Hat 9, 2x XP2500's @ 2.3GHz, 1GB

Client1: 3.900, 0.500, 52.320, 12.7
Client2: 3.74, 0.590, 52.060, 13.550


I wonder how the Opteron will fare as each CPU will have local RAM?

pfb
07-28-2003, 11:31 AM
hmmm...

Gigabyte 7DPPXDW+, Dual MP2400+, 1GB ECC RAM, Windows XP Pro

CPU 0:


Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      9.516     0.484
Foldtraj    55.828     7.141

CPU 1:


Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      9.578     0.438
Foldtraj    56.109     7.594

TheOtherPhil
07-28-2003, 11:49 AM
Well, just checked my dual XP2500 @ 2.1GHz running Win2K and it appears faster than my dual 2.3GHz box on Linux :confused:

Iwill MPX2, Dual XP2500's @ 2.1GHz, Win2K Pro SP4, 512MB:

1x DF client running normally, Client2: 7.859, 0.531, 52.344, 9.703


Now, when I assign affinity to the first client to limit it to a single CPU and then run the bench again on the second CPU, I get:

7.750, 0.563, 51.594, 7.359


So it seems it may be worth assigning each client to its own CPU on Windows machines. I don't think you can assign affinity in Linux unless running kernel 2.6.
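
For readers who want to try the same pinning from a script, here is a rough Python sketch. It assumes a platform where Python exposes `os.sched_setaffinity` (Linux does; Windows and macOS builds do not, in which case the helper just runs the command unpinned). `run_pinned` and `REPORT` are illustrative names, and the demo command is a harmless stand-in rather than the client itself.

```python
import os
import subprocess
import sys


def run_pinned(command, cpu):
    """Run `command` with the child restricted to a single CPU where supported."""
    if not hasattr(os, "sched_setaffinity"):
        # No affinity API on this platform: run the command unpinned.
        return subprocess.run(command, capture_output=True, text=True)
    previous = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {cpu})  # child processes inherit this mask
    try:
        return subprocess.run(command, capture_output=True, text=True)
    finally:
        os.sched_setaffinity(0, previous)  # restore the parent's own mask


# Stand-in command: a child Python that reports which CPUs it may run on.
# For a real run, substitute the benchmark invocation shown in this thread,
# e.g. ["./foldtrajlite", "-bench"], started from each client's directory.
REPORT = [
    sys.executable,
    "-c",
    "import os; print(sorted(os.sched_getaffinity(0)) "
    "if hasattr(os, 'sched_getaffinity') else 'n/a')",
]

if hasattr(os, "sched_getaffinity"):
    first_cpu = min(os.sched_getaffinity(0))
    print(run_pinned(REPORT, first_cpu).stdout.strip())
```

Running one pinned benchmark per CPU and comparing the Foldtraj totals against unpinned runs would show whether pinning helps on a given box.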

TheOtherPhil
07-28-2003, 12:01 PM
I am just installing 2K Server on my dual 2.3GHz box for a direct comparison vs RH 9 results posted above. Back when this 40GB hd formats.....now I know why I love SCSI - IDE takes ages to format :bang:

Darkness Productions
07-28-2003, 12:53 PM
Try this. Nobody gives a shit about your opinion. We run the client because we want to. I agree, that the performance probably isn't what it could be, but it's one person working on the project. Yes, I've asked before if he'd let me have the source to compile it on the Alpha that's sitting in my living room. He declined. I respect that.

On the subject of us being jackasses, well, hello pot, this is kettle. You've made yourself out to be an ignorant asshole that nobody wants to deal with, and you did it from the start. It's really annoying to hear people complaining about you.

So, either stop whining, or, well, stop whining.


Originally posted by ddn
Yup, ddn. My output is down pretty low right now, I'll have more to work with tomorrow.

My output doesn't change the fact that this project is a half-assed and most of you are jack asses.

Richard Clyne
07-28-2003, 01:16 PM
Well, we have not had one for a while, so I suppose we were overdue for a complete tosser to come along and start talking through their arse.

So here's to "ddn" the "Tosser of the Week" Award. :weggy:

TheOtherPhil
07-28-2003, 01:32 PM
Hey ddn, the benchmark results for your Xeon 2.6.....is this with 2x clients running? I am not at all impressed with my dual AMD's (x4) for DF. If the Xeon is putting in those sort of numbers, it will be definitely worth putting my dual AMD's to rest.

pointwood
07-28-2003, 01:45 PM
ddn: Be polite or shut up.

If you aren't capable of that, I would prefer if you would leave Ars, you're giving us a bad name :swear:

TheOtherPhil
07-28-2003, 01:57 PM
OK, now this is interesting. Using the exact same machine and comparing RH9 to Win2K Server SP4:

Iwill MPX2, Red Hat 9, 2x XP2500's @ 2.3GHz, 1GB

Client1: 3.900, 0.500, 52.320, 12.7
Client2: 3.74, 0.590, 52.060, 13.550


Iwill MPX2, Win2K Server SP4, 2x XP2500's @ 2.3GHz, 1GB

Client1: 6.938, 0.406, 48.641, 5.922
Client2: 6.938, 0.406, 48.641, 5.922


What can I say? Win2K is faster for running DF on a dual AMD machine.

Scoofy12
07-28-2003, 02:06 PM
<hotheaded post deleted>

Welnic
07-28-2003, 02:25 PM
TheOtherPhil,

Maybe you could edit your post so we can see which run is actually on Windows? :) I assume it is the second one.

CodeMonkey
07-28-2003, 02:29 PM
Perhaps we can put this whole DDN thing to bed now? I think everyone has gotten the true picture, and scoofy12 probably sums it up pretty well for everyone.

On another note - I don't have the benchmarks here, but on my Intel 2.8 Ghz HT box, Linux runs DF faster than Windows, either treating it as two separate processors, or just one.

Isn't this all very odd?
AMD based boxes working better under Windows.
Intel boxes running faster under Linux.
What's the world coming to ???? :confused:

TheOtherPhil
07-28-2003, 02:33 PM
Originally posted by Welnic
TheOtherPhil,

Maybe you could edit your post so we can see which run is actually on Windows? :) I assume it is the second one.


Er yeah....good idea :o

Scoofy12
07-28-2003, 02:37 PM
perhaps that's because the linux client is compiled with the intel compiler?

ddn
07-28-2003, 02:41 PM
Why would I have started this thread to begin with if I wasn't interested in gathering some info on what runs folding the fastest? Your comments are inane. If you would read my posts, and grasp what I'm trying to tell you, you wouldn't make these helpless retorts.

I am not a "most talented" programmer, nor do I have access to them directly. However, like I said before, if Howard were to open up a bit, there are people that would be willing to optimize the code. One such person has already rescinded his offer.

I do have access to some of the world's finest hardware, albeit indirectly. But that's not going to help much in this situation anyway. No one is going to baby-sit 256 clients (512 if they are dual-proc nodes) to keep them running.

I post a tarball of a broken directory, and I am mocked for the domain it is hosted on. Wow, you really showed me how to be mature and tactful. Did you ever consider that maybe it's a play on typical script k1dd13 sp34k?

ddn
07-28-2003, 02:52 PM
First is GCC, second is Intel.

[root@server1 distribfold]# cat /proc/cpuinfo | grep "model name" -A2 | grep -v step
model name : AMD Athlon(tm) processor
cpu MHz : 1401.734
[root@server1 distribfold]# ./foldtrajlite -bench
One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      8.000     0.730
Foldtraj    70.700    13.470

[root@server1 distribfold]# ./foldtrajlite -bench
One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      7.190     0.640
Foldtraj    79.000    13.370

Scoofy12
07-28-2003, 02:56 PM
ok, i guess my post was a bit uncalled-for (i'm having a crazy day, i apologize. i will delete it even though you quoted it). i have no problem with discussions of what hardware is best, or what software approach, or criticisms of the project. or any of that. what i do have a problem with is an attitude that insults people with no apparent provocation, calls names, etc. none of that is necessary, nor is it productive. the sentiment stands, theres no need to be caustic, especially if you are trying to get people to do something for you.

ddn
07-28-2003, 06:45 PM
Originally posted by TheOtherPhil
Hey ddn, the benchmark results for your Xeon 2.6.....is this with 2x clients running? I am not at all impressed with my dual AMD's (x4) for DF. If the Xeon is putting in those sort of numbers, it will be definitely worth putting my dual AMD's to rest.

That's one client. Single processor, one client. Although you make a good point, now that you mention it I am going to go do a test with 2 -bench'es on the Xeon and see how they fare. It's possible that the Xeon could do enough in-chip to optimize the stream and do well with 2.

Here:
1x:
Maketrj 3.010 0.740
Foldtraj 25.120 11.240

2x:
-------- --------
Maketrj 5.110 1.030
Foldtraj 44.340 17.150

[root@linux1 distribfold2]# Benchmark complete.

Summary
-------
          Usr time  Sys time
          --------  --------
Maketrj      4.990     1.240
Foldtraj    44.760    16.740


I guess I'll be running 2 from now on. This also does some validation of my argument that the code is not nearly as optimized as it could be.
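
The gain from running two clients can be estimated from the numbers above. This is a back-of-the-envelope Python sketch under two stated assumptions: the two 2x benchmarks ran side by side, and each client's elapsed time is roughly its Foldtraj usr+sys time (the benchmark reports CPU time, not wall-clock time). The `throughput` helper is a made-up name for this example.

```python
def cpu_seconds(usr, sys_):
    """Foldtraj usr + sys, the single number suggested earlier in the thread."""
    return usr + sys_


single = [cpu_seconds(25.120, 11.240)]   # one client on its own
dual = [cpu_seconds(44.340, 17.150),     # two clients run together
        cpu_seconds(44.760, 16.740)]


def throughput(times):
    """Benchmarks finished per second, assuming all clients run concurrently
    and the batch is done when the slowest client finishes."""
    return len(times) / max(times)


speedup = throughput(dual) / throughput(single)
print(f"two clients vs one: {speedup:.2f}x")
```

Under those assumptions each client runs slower when paired, but the pair together gets through roughly 18% more work than a single client would in the same time.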

Richard Clyne
07-28-2003, 07:06 PM
Originally posted by ddn
Why would I have started this thread to begin with if I wasn't interested in gathering some info on what runs folding the fastest? Your comments are inane. If you would read my posts, and grasp what I'm trying to tell you, you wouldn't make these helpless retorts.....

I post a tarball of a broken directory, and I am mocked for the domain it is hosted on. Wow, you really showed me how to be mature and tactful. Did you ever consider that maybe it's a play on typical script k1dd13 sp34k?


:lmao: Sorry ddn this is the only response I can think of to this post. Read your own posts and see what a total prat you are.

derek
07-28-2003, 07:58 PM
Actually the uberh4x0r domain ddn posted is my domain. If you don't get the joke - that's your loss. If you have silly attacks you can direct them to me. (Silly rabbit, quit trying to detract from the argument, especially on something aesthetic.)

He asked for legit benchmarks, many people gave; several people attacked him because he is rather brash in his statements. Don't like it? Move on, and quit your bitching. Also quit whining about his results; judging from the results many of you are posting, they're none too impressive.

The project could have benefited tremendously from the gentleman (whom I know very well) who offered his help.

I would consider (maybe) donating some time to the sparc client. This is actually dependent on my work load. Perhaps all I can contribute is optimized 64 bit bins for a UltraSPARC III and UltraSPARC II. If I have time, I could even take a look at the OS X client.

Perhaps a call to arms from Howard would benefit the project. It's been said before, and it's entirely true. One man can make a cavalier effort with the code, however, it will not produce the best results.

If Howard wants help on the Sparc client (or the OS X client), he can email me directly.

- Derek

Paratima
07-28-2003, 08:35 PM
Originally posted by derek
The project could have benefited tremendously from the gentleman (whom I know very well) who offered his help. What gentleman?

rsbriggs
07-28-2003, 08:40 PM
There is a distinct difference between being "brash", and saying that the entire project is "half-assed" and that everyone involved with it is a "jackass". That crosses over the line into rudeness in my book.... Enough said....

derek
07-28-2003, 09:00 PM
Originally posted by Paratima
What gentleman?

dtj. It was actually in my initial thread about requests to the client.

- derek

Grumpy
07-28-2003, 09:15 PM
Well, this proves my Tyan is as they say, the slowest MPX MB..Stability over Speed :sleepy: Mind you, in the real world the Duallie runs a lot faster than the Benchmark shows...

TheOtherPhil
07-29-2003, 05:49 AM
Originally posted by Grumpy
Well, this proves my Tyan is as they say, the slowest MPX MB..Stability over Speed :sleepy: Mind you, in the real world the Duallie runs a lot faster than the Benchmark shows...


Actually Grumpy, I am not convinced that it does. I am estimating that a dual AMD is something like 70% efficient for DF....if that. I'm personally running 4x dual AMD's and a P4 (~19.6GHz). 24/7 power is 2x duals and the P4 (11.8GHz). The part time dual's (~7.8GHz) run ~8hrs a day. All run as a service with useram=1.

My daily output is ~240K/ day. I really should be getting much higher than that I feel with the power I have invested in this project.

I am going to conduct a small test within the next few weeks where I remove the procs from my 2x full time dual's and run them in uni-processor boards for a while. I am expecting to see significantly higher numbers (~+30%).

Grumpy
07-29-2003, 09:47 AM
Hmmm, I would estimate my Dual 2400's are equal in output to 2 X 2100 -2200XPs. During Phase I my 2400s were both getting on a certain protein 120K a day and the 2100XP 105K, now it seems a lot slower. Phase II, as far as I can tell, is not handling AMD SMP as well :cry:

Welnic
07-29-2003, 11:42 AM
I had 4 2000MPs and 5 2100XPs during phase I and they were really close on their output. I remember when I first got the 2100s being disappointed that they didn't have a bigger advantage than they did. I don't remember the exact numbers.

I don't have any idea of how individual machines are doing during phase II. I do seem to have the wrong setup with AMD linux boxen in my farm and borged Intel XP boxen.

Grumpy
07-30-2003, 08:33 AM
Just hope the Opteron does not get slowed down like the MPX does. I am waiting for someone to post some Benches soon, PLEASE.

TheOtherPhil
08-06-2003, 04:41 PM
Intel, P4, 3668MHz, WinXP Pro SP1, 1GB

5.875, 0.500, 20.672, 7.797

Grumpy
08-06-2003, 07:01 PM
So close to breaking 20 :scared:

TheOtherPhil
08-06-2003, 07:24 PM
Originally posted by Grumpy
So close to breaking 20 :scared:


Yeah, that's what I was aiming for. I'll try a higher o/c tomorrow :)

Grumpy
08-07-2003, 03:16 AM
If the Fire Brigade aint at ya front door, yor not OCing it enough :whip:

TheOtherPhil
08-07-2003, 06:48 PM
Hmmm, deja vu.

Anyway, I managed to break the 20 secs mark with:

Intel P4, 3724MHz, WinXP Pro SP1, 1GB

5.734, 0.500, 19.688, 8.406 (Screenshot (http://homepage.ntlworld.com/phil.harling/images/dfbench.gif))

The system ran the Prime95 torture test for just over an hour before running the DF bench so it seems stable enough at that speed. The problem is the high vcore (1.8V)...it's just too high for my comfort to run 24/7. I have now dropped back to 3.5GHz and 1.65V.

Grumpy
08-10-2003, 02:20 AM
Well, I have discovered that Client Priority of 0 is best under Win 2 K...my Foldtraj went from 65 to 53.5 seconds :|party|:

[da'rayven]
08-11-2003, 02:01 PM
Client bench doesn't work on the MacOS X client anymore

At least on two machines, the bench always dies with a generic Error 3... :(

As for the PCs:

Athlon XP @ 2.3GHz, 200FSB, 512MB Dual Channel 11-3-2-2.5 PC3200, Windows XP


One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 6.609 0.344
Foldtraj 32.297 6.109

Press any key to continue . . .

Athlon XP Palomino @ 1.66GHz, 145FSB, 128MB 5-2-2-2 PC2100, SuSE Linux 8.2


One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 5.010 0.420
Foldtraj 58.848 7.600

P4 Northwood 2.0GHz @ 2.6GHz, 130FSB, 512MB 4-2-2-2 PC2700, Windows XP


One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 9.391 0.547
Foldtraj 37.609 11.984

dtsang
08-16-2003, 11:20 AM
Apple,PowerPC G4,466,MacOSX10.1.5,15.600,0.000,172.490,0.000

[da'rayven]
08-16-2003, 11:40 AM
okay, in that case, either it doesn't work on 10.2.x, or it will work with the new client.... I'll try again later ;)

dtsang
08-16-2003, 06:18 PM
Originally posted by [da'rayven]
okay, in that case, either it doesn't work on 10.2.x, or it will work with the new client.... I'll try again later ;)

I have a feeling it will work just fine. The last client did NOT work on my machine with 10.1.5 - it would fail on the trajectory thing (after gen0). It works absolutely perfectly now (with the exception of the native.val mixup).

Let me know if you are unable to get the current client to run on 10.2.x. I have a 10.3 beta installed and could see if it runs on that - if it runs fine on the 10.3 beta, then it should run under Jaguar.

[da'rayven]
08-16-2003, 07:08 PM
The client runs. It's the benchmark that doesn't :D I have been folding with my Macs as long as I have been folding ;) What I'm saying is it's been a few weeks since I tried, and maybe the new client's bench will work...

erk
08-24-2003, 07:57 AM
Originally posted by TheOtherPhil
Actually Grumpy, I am not convinced that it does. I am estimating that a dual AMD is something like 70% efficient for DF....if that. I'm personally running 4x dual AMD's and a P4 (~19.6GHz). 24/7 power is 2x duals and the P4 (11.8GHz). The part time dual's (~7.8GHz) run ~8hrs a day. All run as a service with useram=1.

My daily output is ~240K/ day. I really should be getting much higher than that I feel with the power I have invested in this project.

I am going to conduct a small test within the next few weeks where I remove the procs from my 2x full time dual's and run them in uni-processor boards for a while. I am expecting to see significantly higher numbers (~+30%).

I am not getting the SMP results either, K7D motherboard with a pair of MP2800+. I tried two versions of FreeBSD and RedHat 9.0. FreeBSD 4.8-RELEASE was the quickest but not by much:

Usr time Sys time
-------- --------
Maketrj 4.836 0.875
Foldtraj 58.367 3.242


My soltek SL-75FRN2 with XP2600+ and RedHat 9.0:

Usr time Sys time
-------- --------
Maketrj 3.590 0.620
Foldtraj 37.120 11.750

erk
08-30-2003, 10:46 PM
Could someone please explain exactly what the four numbers returned by the benchmark mean?

dano
10-11-2003, 11:52 PM
athlon 64 3200
gigabyte k8vt800pro
256 mb apacer pc3200 cl3

winXP 32bit

Maketrj 5.939, 0.300
Foldtraj 36.663, 5.438

Mandrake linux 64 bit beta

Maketrj 2.210, 0.470
Foldtraj 33.340, 3.880

Grumpy
10-12-2003, 10:54 AM
CL3 Ram :bonk:


Nice time for ram at that speed...can you get the Ram down to 2.5 and try it ?

And Mandrake 64 Bit seems to be getting some extra juice too, it is a good sign for the 64 Bit Code running 32 Bit Apps at faster speeds :cheers:

dano
10-12-2003, 12:30 PM
Nice time for ram at that speed...can you get the Ram down to 2.5 and try it ?

I only have an adjustment for the ram voltage in the bios.:(
The memory controller is integrated into the cpu so I guess the CAS timing is not adjustable.

Here is the benchmark at 2.2 gig :D

Maketrj 1.99, 0.480
Foldtraj 30.710, 3.300

HaloJones
10-12-2003, 01:44 PM
I'd love to know a bit more about this benchmark program. TheOtherPhil got sub-20 seconds with his P4 at 3.7GHz. My XP @2400 gets 35 seconds which suggests that the benchmark program reflects pure MHz. Yet my office 2400MHz P4s suck, producing much more slowly than my home Athlons. (I'll benchmark a sample office P4 tomorrow.)

Most crunchers here seem to agree that Athlons are faster than P4s at DF so how come the benchmarks don't reflect that? Is the benchmark representative?

What does it actually mean?

Grumpy
10-12-2003, 06:09 PM
P4 + 800 FSB + 865/875 + HT = Below 20 Seconds :|party|:

Whether this transfers to real world speed over the Athlons is another question.. it is possible that only the benchmark gets a boost from the above points, I doubt we will ever prove or disprove it :dunno:

HaloJones
10-13-2003, 03:00 AM
Originally posted by Grumpy
P4 + 800 FSB + 865/875 + HT = Below 20 Seconds :|party|:

Whether this transfers to real world speed over the Athlons is another question.. it is possible that only the benchmark gets a boost from the above points, I doubt we will ever prove or disprove it :dunno:

P4 @ 3.7GHz in a phase change cooled computer. Unless DFII uses SSE2 or Netburst, it cannot compute DF as fast per MHz as an Athlon simply due to the number of instructions per clock cycle. I'm not trying to re-start an old argument here but programs have to be specifically written for P4s to take advantage of them. A simple x86 routine is quicker on AMD than Intel.

Perhaps Howard could enlighten us on what the benchmark actually does.

HaloJones
10-13-2003, 07:04 AM
As promised:

Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 9.438 0.750
Foldtraj 47.328 15.922


P4-2400 (W2K)

Athlon XP at the same clockspeed does 35 seconds.
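
One crude way to compare these results per clock is to multiply the Foldtraj usr time by the clock speed, which gives an estimate of CPU cycles spent on the benchmark. The sketch below assumes the clock speeds quoted in this thread are accurate, and it ignores FSB, memory timings and OS differences entirely. `giga_cycles` is a hypothetical helper name.

```python
def giga_cycles(mhz, usr_seconds):
    """Approximate CPU cycles (in billions) spent in Foldtraj user time."""
    return mhz * usr_seconds / 1000.0


# (clock in MHz, Foldtraj usr seconds) as posted in this thread.
results = {
    "P4 @ 3724 MHz (TheOtherPhil)": (3724, 19.688),
    "P4-2400 (HaloJones)": (2400, 47.328),
    "Athlon XP @ 2400 MHz (HaloJones)": (2400, 35.0),
}

# Fewer cycles for the same benchmark means more work done per clock.
for name, (mhz, usr) in sorted(results.items(), key=lambda kv: giga_cycles(*kv[1])):
    print(f"{name:35s} {giga_cycles(mhz, usr):7.1f} Gcycles")
```

By this measure the office P4-2400 needs noticeably more cycles than the Athlon XP, while the heavily overclocked P4 needs fewer than either, which hints that bus and memory speed matter as well as raw MHz.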

bwkaz
10-13-2003, 07:02 PM
Originally posted by HaloJones
Unless DFII uses SSE2 or Netburst, it cannot compute DF as fast per MHz as an Athlon simply due to the number of instructions per clock cycle. And even if it does use SSE2 (or Netburst? dunno, I'm not familiar with what Netburst is), it still won't be able to compete with the Athlon.

The vast majority of the DF client's time is spent chasing pointers (AKA, doing integer arithmetic on memory addresses), not doing floating-point stuff. That's why the current client doesn't even use SSE (and may not use MMX, either) -- there's simply nothing to be gained from it, because that's not where the code hot-spots are.

Brian the Fist
10-14-2003, 12:24 PM
I think I've mentioned before but the benchmark builds a .trj file (trajectory distribution) for one particular sequence, and then builds 100 structures of it (like gen. 0). The protein is always the same, regardless of what protein we are working on. The random seed is fixed as well, so the procedure is completely deterministic (will always make the same 100 structures). Unfortunately this doesn't hold true across different operating systems as the floating point rounding error seems to vary on different platforms which in turn influences the sequence of events.


Thus it should reflect well the performance of the actual client in most cases.
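
To illustrate the fixed-seed point, here is a toy sketch in Python. It is not the client's folding algorithm: `build_structures` is a made-up stand-in that just draws random numbers, and the seed value is arbitrary. It only shows why a fixed seed makes repeated runs on the same machine do identical work.

```python
import random


def build_structures(seed, count=100):
    """Toy stand-in for building `count` structures from a seeded generator."""
    rng = random.Random(seed)  # private generator, unaffected by other code
    return [rng.random() for _ in range(count)]


run_a = build_structures(seed=12345)
run_b = build_structures(seed=12345)

# Same seed, same sequence of draws, so both runs do the same "work".
print(run_a == run_b)
```

In the real client the sequence of events also depends on floating-point rounding, which is why the work can differ between platforms even with the same seed.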

TheOtherPhil
10-18-2003, 08:54 AM
Mike, the P4 I was using was running a 266fsb (1064MHz Quad Pumped) with the RAM at very aggressive timings (~6GB/s mem bandwidth (http://homepage.ntlworld.com/phil.harling/images/P4/285mem.gif) Sandra Bench). Clock for clock the Athlon may be faster but the P4 in question has a 1.3GHz Clock speed advantage over your 533fsb 2.4's and almost double the effective FSB.

FWIW, the P4 chewed through DF extremely fast and pretty much equalled my dual barton's at 2.3GHz in output.

Grumpy
10-18-2003, 06:23 PM
Yer, the OCed PIV running above 800 FSB is ugly :|party|:

And TheOtherPhil, your Signature is scaring the children, damn snoop coder :bonk:

HaloJones
10-19-2003, 07:35 AM
My office P4s are almost certainly 400MHz FSB since they are "cheap" and nasty office-use Compaq Evos.

Off-topic: why do businesses allow themselves to get so badly ripped off by the big manufacturers? I'm all for buying super-stable machines but do they need to be s o s l o w?

dtsang
10-19-2003, 11:58 AM
Does anybody here have a Power Mac G5? I would love to see how one of those performed, cause my G4 just plain sucks in dfold.:trash:

[veix]
10-20-2003, 04:03 AM
Not the best performance, and not under 20 sec, but interesting numbers imo.
Pentium-M "Centrino" 1,4Ghz WinXP Home SP1 256MB DDR266
Maketrj 7.571 0.651
Foldtraj 44.304 10.715
Wonder if it is possible to run that cpu on desktop motherboard :P

erk
10-20-2003, 06:08 PM
Originally posted by [veix]
Not the best performance, and not under 20 sec, but interesting numbers imo.
Pentium-M "Centrino" 1,4Ghz WinXP Home SP1 256MB DDR266
Maketrj 7.571 0.651
Foldtraj 44.304 10.715
Wonder if it is possible to run that cpu on desktop motherboard :P

There are Mini-ITX motherboards coming out for it.

http://www.lippert-at.com/miniitx.html

matitaccia
11-17-2003, 05:42 PM
p4m 1,8GHz, 512mb, WinXp Sp1

Summary
-------
Usr time Sys time
-------- --------
Maketrj 17.085 0.781
Foldtraj 64.803 18.166

Sum = 64.803 + 18.166 = 82.969

Didn't know what was better to do... have tried to write everything I had.
EhEH...

Ciao!

Xelas
11-21-2003, 12:19 PM
P4C 2.4 @ 3.0, (800 MHz FSB oc'd -> 1000)
512 MB PC3200 CL3 noname RAM (Samsung chips)

One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 7.156 0.484
Foldtraj 31.156 8.563

This result is with memory running synchronously (250 MHz DDR = 500 MHz, 1 GHz FSB) but with very loose timings 3-4-3-6. Even so, my mem voltage is at 2.95 volts. PAS is set to "Ultra Turbo" (fastest).

With Memory running asynch at native PC3200 speeds (200 MHz DDR = 400 MHz, 1 GHz FSB) I can set timings to 2-2-2-5, but the machine runs a tad slower, giving something around USR = 33 SYS = 8.7

System is Win XP Pro.

yujen
12-18-2003, 11:33 PM
Running on an AMD Opteron 240 with 4gb ECC/Registered DDR333
Linux 2.6.0 SMP 32-bit NUMA Optimised

---
One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 4.250 0.850
Foldtraj 36.880 11.280


---

I would appreciate it if anyone could tell me how to configure the benchmark to run on 2 processors simultaneously.

I ran 2 benchmarks in 2 separate windows "almost" simultaneously (press enter, switch to another window, press enter again) I achieve roughly the same output as above.

Grumpy
12-19-2003, 06:20 AM
The way you describe is how I do it. Just swap windows and run the second :)

What Client was that tested on. The regular or one of the Test Clients. It is very fast for a 140. Almost as fast as a 3200 Barton @ 200 FSB :scared:

Damn, I am saving up now, forget the Athlon64 3000, I want a Opteron Duallie after all :|party|:

O yeah, is it the Iwill MB, and what video card is it running etc etc

Brian the Fist
12-19-2003, 10:45 AM
Please note that for the recent beta clients, and the new one being released now (as indicated in the whatsnew.txt), the benchmark is no longer comparable to past benchmarks, due to the changes made to the algorithm. Interestingly, the new benchmark can show how much the algorithm has been sped up compared to the old algorithm. Please don't base hardware decisions on old vs. new benchmarks therefore :D

Grumpy
12-19-2003, 04:23 PM
Yeah, that is why I asked for the Client Version. But it would have to have been done with the 108 I imagine, so 37 is very fast for the 140 all the same. It appears Linux 64 is running the Client a lot faster than Linux 32 Bit, even without a recompile :cheers:

Umm, what precision are the numbers running at with the Client Howard...64, 72 :confused:

yujen
12-19-2003, 08:36 PM
Ooops, my apologies, forgot to mention that's using the new client so yeah, the numbers aren't that great... I'm more interested in how well it scales in SMP for NUMA vs non-NUMA which is why I asked if there's a better way to run 2 clients simultaneously.

Grumpy
12-19-2003, 10:58 PM
I get 36 seconds for Foldtraj with the new updated Client on a NF2 MB and Barton @ 2275 Mhz, so if the 240 1.4 Ghz Opteron gets close to this, I am very very impressed with your configuration :cheers:

yujen
12-19-2003, 11:08 PM
Originally posted by Grumpy
The way you describe is how I do it. Just swap windows and run the second :)

What Client was that tested on. The regular or one of the Test Clients. It is very fast for a 140. Almost as fast as a 3200 Barton @ 200 FSB :scared:

Damn, I am saving up now, forget the Athlon64 3000, I want a Opteron Duallie after all :|party|:

O yeah, is it the Iwill MB, and what video card is it running etc etc

ya, thats using the Iwill DK8SL with on-board ATi RageXL video... its not the DK8X workstation board unfortunately.

I've noticed quite a bit of performance improvement 2 days ago when I switched to 2.6.0 NUMA optimised... I think the difference is NUMA vs non-NUMA on the Opteron since the client is compiled in 32-bit so using a 64-bit kernel won't net any real benefit unless the calls to system libraries benefit from 64-bit in some way :)

Since each client uses up to 150mb of RAM, then theres real benefits to be had with NUMA :D

Grumpy
12-19-2003, 11:13 PM
If the numbers being crunched are 64 bit precision, then it will make a heck of a difference as it can run it native and not have to emulate 64 bit precision :|party|:

So it is a dual 240 system? And does the MB have the ram shared, so cpu 2 goes through cpu1 for memory ?

yujen
12-19-2003, 11:36 PM
No, it has 8 ram slots, 4 per processors in a 4+4 configuration :) (Iwill doesn't make castrated motherboards in a 4+0 configuration)

I use 4 sticks of 1GB ECC/Registered DDR333, so 2 sticks per processor with both processors running in 128-bit memory path.

If CPU1 has to go through CPU0 for ram then NUMA optimisations mean squat :)

True, if the client uses double precision floating point computations, then compiling for x86-64 "may" see quite a significant improvement... it comes down to 32-bit vs 64-bit, although if the UltraSPARC numbers aren't anything to write home about, going to 64-bit may not be much of an improvement, if any.... you'll probably have to hack the code somewhere, since a "double" on a 64-bit architecture means 128-bit precision; if you're only aiming for 64-bit then you're doing more work than you need to.
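
A side note on the "double" point: in C on both x86 and x86-64, `double` is normally the 64-bit IEEE 754 type regardless of pointer width, so a 64-bit build would not by itself change the floating-point precision. A quick way to check on a given machine, assuming a Python build that uses the platform's native C double (which is the usual case):

```python
import struct
import sys

# Size of the platform's native C double, in bits.
double_bits = struct.calcsize("d") * 8

# Number of mantissa bits in a Python float, which wraps a C double.
mantissa_bits = sys.float_info.mant_dig

# Pointer width, to show it is independent of the double size.
pointer_bits = struct.calcsize("P") * 8

print(f"double: {double_bits} bits, mantissa: {mantissa_bits} bits, "
      f"pointers: {pointer_bits} bits")
```

On mainstream 32-bit and 64-bit systems the first two values should be the same (64 and 53); only the pointer width differs.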

Grumpy
12-19-2003, 11:40 PM
Hmmm, in the last Opteron 140 benchmark someone posted, my Barton was 32 seconds and the 140 was 46 seconds in the Fold Benchmark

Thor
12-20-2003, 05:18 AM
These are the benchmarks for the new client on a P4 2.66Ghz with 512MB PC1066 Rambus and Win2K:


One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 8.582 0.340
Foldtraj 34.329 9.113


I think that's a pretty good score...

can anybody else post some new benchmarks?


Greets thor

Grumpy
12-20-2003, 06:28 AM
Here is my Dual 2400 MP


Summary
-------
Usr time Sys time
-------- --------
Maketrj 9.266 0.484
Foldtraj 59.859 11.875


And my AMD Barton @ 2275 Mhz


Summary
-------
Usr time Sys time
-------- --------
Maketrj 6.156 0.234
Foldtraj 36.578 6.703

pfb
12-20-2003, 06:49 AM
Originally posted by Grumpy
Here is my Dual 2400 MP

Summary
-------
Usr time Sys time
-------- --------
Maketrj 9.266 0.484
Foldtraj 59.859 11.875



My Dual 2400MP (@2600+ speeds):



Summary
-------
Usr time Sys time
-------- --------
Maketrj 8.016 0.500
Foldtraj 58.078 11.531

Not much difference...

Grumpy
12-20-2003, 08:41 AM
I run Win 2 K Priority 0 :cheers:

iggy
12-20-2003, 10:20 AM
Barton @ 2226 MHz, WinXP:


Usr time Sys time
-------- --------
Maketrj 5.984 0.203
Foldtraj 34.234 5.906

AXP1700+ @2205 MHz:


Usr time Sys time
-------- --------
Maketrj 6.703 0.391
Foldtraj 36.234 5.859

AXP2000+ (1667 MHz), Win2K3:


Usr time Sys time
-------- --------
Maketrj 9.859 0.594
Foldtraj 59.813 9.344

AXP2000+ (1667MHz), Knoppix 3.3:


Usr time Sys time
-------- --------
Maketrj 4.850 1.010
Foldtraj 56.470 17.800

gistech1978
12-20-2003, 11:02 AM
2100+ oced to 2600+
2139 (186*11.5)

Maketraj Usr Time = 7.078
Maketraj Sys Time = .422
Foldtraj Usr Time = 37.641
Foldtraj Sys Time = 6.719

TazAmdmb
12-20-2003, 02:03 PM
Opteron 146 @ 2 GHz
Summary
-------
Usr time / Sys time
-------- --------
Maketrj 5.672 / 1.031

Foldtraj 28.719 / 4.641

hega
12-20-2003, 04:08 PM
Intel P4 CPU 2.60GHz, 800MHz FSB
Linux 2.4.23

Usr time Sys time
-------- --------
Maketrj 2.590 0.610
Foldtraj 28.330 9.620

dano
12-20-2003, 04:33 PM
Athlon 64 3200+

Mandrake 9.2 64bit

Usr time / Sys time

Maketrj 3.220 / 0.490

Foldtraj 31.900 / 5.010

Lucus Maximus
12-23-2003, 07:32 AM
AXP2500+ @ 2.4GHz with 200Mhz FSB (RAM sync w/ CAS2.0-2-2-11)
WinXP Pro

Summary
-------
Usr time / Sys time
-------- / --------
Maketrj 5.703 / 0.438
Foldtraj 34.453 / 6.453

------------------------------

AXP2500+ @ 2.5GHz with 200Mhz FSB (RAM sync w/ CAS2.0-2-2-11)
WinXP Pro

Summary
-------
Usr time / Sys time
-------- / --------
Maketrj 5.672 / 0.203
Foldtraj 33.813 / 6.078

-Lucus :)

Gortok
12-23-2003, 09:39 PM
Barton 2500+@2260 2x256 HyperX 3500@205 2.0-3-3-6 NF7-S v2

One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 6.266 0.953
Foldtraj 34.609 6.688





:cool:

pharm24
12-27-2003, 11:32 PM
AMD,Athlon XP, 2224,Win2K,6.578,1.141,37.188,8.234
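
For anyone collecting the comma-separated entries in this thread, here is a minimal parsing sketch in Python. It assumes the eight-field format requested in the first post (manufacturer, processor type, MHz, OS, then the four timings). `BenchResult` and `parse_result` are made-up names for illustration.

```python
from dataclasses import dataclass


@dataclass
class BenchResult:
    manufacturer: str
    processor: str
    mhz: int
    os_name: str
    maketrj_usr: float
    maketrj_sys: float
    foldtraj_usr: float
    foldtraj_sys: float

    @property
    def foldtraj_total(self):
        """Foldtraj usr + sys seconds."""
        return self.foldtraj_usr + self.foldtraj_sys


def parse_result(line):
    """Parse one 'Manufacturer,Processor,MHz,OS,t1,t2,t3,t4' line."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    manufacturer, processor, mhz, os_name, *timings = fields
    return BenchResult(manufacturer, processor, int(mhz), os_name,
                       *(float(t) for t in timings))


entry = parse_result("AMD,Athlon XP, 2224,Win2K,6.578,1.141,37.188,8.234")
print(entry.processor, entry.mhz, f"{entry.foldtraj_total:.3f}")
```

Entries posted as free-form text or pasted benchmark output would need separate handling; this only covers posts that follow the requested format.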

AAdjuster
01-08-2004, 07:49 PM
Barton 2800 @ 2400mhz-10.5x228
2x256 Corsair 11-2-2-2
NF7 v2.0

usr time system time

Maketrj / 5.538 0.250
Foldtraj / 31.976 4.717

TheOtherPhil
01-15-2004, 04:52 PM
AMD FX-51, 1GB Corsair XMS3200RPT - completely stock (for now ;)). WinXP Pro SP1


Summary
-------
Usr time Sys time
-------- --------
Maketrj 5.266 0.219
Foldtraj 26.766 3.781

Lucus Maximus
01-17-2004, 03:59 AM
Update:
AXP2500+ @ 2.456GHz with 223Mhz FSB (RAM sync w/ CAS2.0-2-2-11)
WinXP Pro

Summary
-------
Usr time / Sys time
-------- / --------
Maketrj 5.508 / 0.250
Foldtraj 32.627 / 5.708

-Lucus :)

erk
01-23-2004, 05:35 PM
Anyone got -bench results for an overclocked A64 yet?

Grumpy
01-23-2004, 09:14 PM
I have seen a 3200 Athlon 64 OCed to more than 2.1 Ghz I believe it got something like..

5.656_____ 0.141
30.313____ 4.828

:rolleyes:

JohnyDog
01-30-2004, 12:56 AM
AMD Tbird 1333 Mhz, Linux 2.4.21

Usr time Sys time
-------- --------
Maketrj 8.260 0.520
Foldtraj 75.340 11.080

TheOtherPhil
01-30-2004, 05:58 AM
FX51 @ 2.46GHz, WinXP Pro SP1, 1Gb Corsair Reg. PC3200:


Summary
-------
Usr time Sys time
-------- --------
Maketrj 4.750 0.219
Foldtraj 24.391 3.438

HaloJones
01-30-2004, 12:35 PM
The Genome Collective gets the fastest cruncher :drums:

frmky
01-31-2004, 07:49 PM
One more...

Opteron 144 running at 1.8GHz
1GB PC3200 Reg. ECC Dual Channel RAM
Windows Server 2003 AMD64



Usr time Sys time
-------- --------
Maketrj 6.328 0.219
Foldtraj 30.141 6.844


Greg

Grumpy
01-31-2004, 08:23 PM
What Motherboard is that frmky :confused:

Nice result :drums:

frmky
02-01-2004, 03:19 PM
It's an ASUS SK8V.

I also tried the Linux version of the client. The ICC version gives a slower time for foldtraj:
2.49 0.41
35.95 5.08

and the gcc version won't run since it depends on ncurses-4 and I don't have a 32-bit version of it installed.

Greg

bwkaz
02-01-2004, 05:03 PM
You can ln -s libncurses.so.5 libncurses.so.4 to use the gcc client, at least on a 32-bit system. I suppose that might not work on a 64-bit one, but I don't see why it wouldn't.

If you have a working libncurses.so.5 for the icc version, you can just use it, in other words.

frmky
02-01-2004, 06:54 PM
Yeah...that'll probably work once I figure out where the 32-bit libraries are stored. I'll have a look at that later.
Greg

bwkaz
02-01-2004, 07:42 PM
I think it's just plain /lib and /usr/lib. I think that /lib64 and /usr/lib64 are for the 64-bit libraries.

However, I don't know for sure, since I have no Opteron / Athlon64 hardware. :(

frmky
02-01-2004, 09:53 PM
/lib and /usr/lib are 64-bit, /lib32 and /usr/lib32 are the 32 bit libraries. I created the link and followed up with env-update, but it still claims it can't find libncurses.so.4. I checked to make sure those paths were in /etc/env.d, and they are. Even ran ldconfig manually just to make sure. Not sure what's up. Oh well...and I was hoping to break the 30 second barrier. ;)

BTW, I noticed that there are 64-bit binaries for a few other platforms. Is it a simple recompile? Are they faster than the 32-bit ones? If so, I'm willing to make my machine available to make linux-amd64 and freebsd-amd64 binaries. :D

Greg

bwkaz
02-01-2004, 09:59 PM
Maybe /sbin/ldconfig?

I'm not sure what env-update is supposed to do, nor am I sure what /etc/env.d is for; they sound like distro-magic tools to me. ;) Though it's always possible that they're for glibc on x86-64...

IIRC from the past, the DF people need a machine that they have physical access to before they can make binaries for that architecture. I don't remember why for sure, though...

But I think that with one of them (Solaris or IRIX) the 32-bit and 64-bit versions are about the same speed anyway. No idea if the same will hold true on Intel, but it might.

frmky
02-01-2004, 10:20 PM
Nope...still no go. I give up on it. :confused: :trash:

OK..back on topic now....sorry about that! :rolleyes:

gistech1978
02-05-2004, 09:44 AM
im having a strange problem...
so the benchmark is completely independent of the current protein?
the reason i ask this is because earlier in this thread i have posted some benches where my maketraj sys time was .422 (186x12) for my current setup or .359 running stock.

well now i have some pc3200 and running a 2100+ at 200x10.5
i benched and got some results basically on par with the results before.

this past weekend i reformatted my hdd and reinstalled xp. everything is essentially the same, and now my maketraj sys time is well over 2 and the others have noticeably slowed as well. i'm at the office right now but i got something like 2.xxx. i tried various ram timings and it's all basically the same. right now i have found the timings 2.0-2-3-11 or 2.0-3-3-11 to give me the most memory bandwidth on the nforce2 chipset.

im confused.

Bionic_Redneck
02-08-2004, 10:19 PM
XP1800+
256mb ddr
Gentoo linux
ICC client
Note: 3 different kernels; ran the benchmark 3 times on each for accuracy


Kernel 2.4.22

Summary
-------
Usr time Sys time
-------- --------
Maketrj 5.960 0.510
Foldtraj 72.220 9.010

Kernel 2.6.1

Summary
-------
Usr time Sys time
-------- --------
Maketrj 6.050 0.490
Foldtraj 71.720 11.220

Kernel 2.6.2

Summary
Usr time Sys time
-------- --------
Maketrj 5.190 0.890
Foldtraj 59.200 15.970

Bionic_Redneck
02-08-2004, 10:50 PM
kde 3.2 konsole uses 3-5% of cpu; by using aterm instead I dropped the times a little


Summary
-------
Usr time Sys time
-------- --------
Maketrj 5.130 0.900
Foldtraj 58.860 15.730

Thor
02-09-2004, 08:50 AM
@ Bionic_Redneck
buy another 256MB RAM ....

My benches on an XP2800 using W2K, 512MB RAM @166Mhz:


usr time: sys time:
10.045 1.512
54.759 9.854


That's about a 4second difference....


Greets Thor

junky
02-10-2004, 05:18 AM
gonna ask a stupid question:
what are these args?
Maketrj usr, Maketrj sys, Foldtraj usr, Foldtraj sys

i see a lot of number there, how to get these numbers.
thanks.

bwkaz
02-10-2004, 08:36 AM
You get those numbers by running ./foldtrajlite -bench (or .\foldtrajlite -bench on Windows).

I'm not sure what the difference between Maketrj and Foldtraj is (I think the first one is what the client does at the end of a generation, and the second one is what the client does most of the time otherwise), but the difference between "usr" and "sys" time is that "sys" time is spent running kernel code (that the DF program calls). "usr" time is spent running code in the DF program itself.
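
To see the usr/sys split for yourself, here is a small Python sketch using `os.times()`, which reports the user and system CPU time of the current process. The two workloads (`busy_loop`, `syscall_loop`) and the `cpu_times` helper are made up for illustration and have nothing to do with the DF client.

```python
import os


def cpu_times(func):
    """Return (usr, sys) CPU seconds consumed while running func()."""
    before = os.times()
    func()
    after = os.times()
    return after.user - before.user, after.system - before.system


def busy_loop():
    """Pure computation: time should land almost entirely in usr."""
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def syscall_loop():
    """Repeated calls into the kernel: a larger share should land in sys."""
    for _ in range(20_000):
        os.stat(".")


for name, work in (("busy_loop", busy_loop), ("syscall_loop", syscall_loop)):
    usr, system = cpu_times(work)
    print(f"{name:13s} usr={usr:.3f}s sys={system:.3f}s")
```

The exact numbers depend on the machine and the OS timer resolution, so treat the output as a rough picture and not a measurement.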

Grumpy
02-10-2004, 04:57 PM
Thor, that is a very slow result for your computer specs...did you have other stuff running at the time :cry:

iggy
02-10-2004, 05:51 PM
Probably a typo - I guess it should have been AXP1800+...

Thor
02-10-2004, 06:29 PM
Thats right, my mistake...it was late at night over here when I wrote that. 12hour learning session isn't any good for a brain :crazy: :bonk:
But maybe I was just thinking again about getting myself a new XP2500 and wished there would be enough money for a 2800+ ;)
But as always , students never ever have enough money...need a new job to be able to spend some money again:D

Greets Thor

Grumpy
02-11-2004, 05:17 PM
:sleepy: Oops, sleep deprivation is evil....

Bionic_Redneck
02-11-2004, 09:16 PM
Ok Thor I was using ICC client switched to GCC

Summary
-------
Usr time Sys time
-------- --------
Maketrj 6.010 0.890
Foldtraj 52.470 15.720

isp
02-18-2004, 03:50 PM
So I'm seeing these numbers but would like to ask a question...

If someone wanted to build a mini farm of folders for df, what would you guess would yield the most output...

3x 2.4c @ 3.2 w/ ht on
or
5x amd 2500 or 2600+ at stock

TIA :D

doritos
03-17-2004, 02:05 AM
HP PA-8500 (440Mhz)

Summary
-------
Usr time Sys time
-------- --------
Maketrj 11.470 0.520
Foldtraj 145.640 3.330


HP PA-8700 (875Mhz)

Summary
-------
Usr time Sys time
-------- --------
Maketrj 7.110 0.430
Foldtraj 79.680 1.730

Bionic_Redneck
03-17-2004, 08:26 AM
Originally posted by isp
So I'm seeing these numbers but would like to ask a question...

If someone wanted to build a mini farm of folders for df, what would you guess would yield the most output...

3x 2.4c @ 3.2 w/ ht on
or
5x amd 2500 or 2600+ at stock

TIA :D

I wouldn't put a lot of stock in some of the p4 benchmarks...Don't hold me to it but I believe they were posted before the protein update.

isp
03-17-2004, 08:32 AM
Originally posted by Bionic_Redneck
I wouldn't put a lot of stock in some of the p4 benchmarks...Don't hold me to it but I believe they were posted before the protein update.

Thanks, I decided to go with AMD.

Welnic
03-17-2004, 11:27 AM
Originally posted by Bionic_Redneck
I wouldn't put a lot of stock in some of the p4 benchmarks...Don't hold me to it but I believe they were posted before the protein update.

The benchmark does not have anything to do with the current protein. The protein has no effect on it; that is why it is a benchmark. If the client itself changes, which it does, that can change the benchmark.

Bionic_Redneck
03-18-2004, 07:21 AM
So let me get this straight: the benchmark doesn't change with a new protein but does with a new client, and we get a new client every time there is a new protein. Am I missing something here?

Paratima
03-18-2004, 07:47 AM
If you check the date on the current foldtrajlite.exe, you will probably find it to be 2/10/2004. The client did not change this time, only the protein files. :)

Welnic
03-18-2004, 01:27 PM
And most client changes are various bug fixes. The only thing that would change the benchmark would be a change in the algorithm.

Lucus Maximus
05-21-2004, 02:05 PM
A64 3200+ @ 2280MHz, HTT 228Mhz/Mem @ 190Mhz:

Summary
-------
Usr time Sys time
-------- --------
Maketrj 2.906 0.172
Foldtraj 27.453 5.578

Press any key to continue . . .

A64 3200+ Stock

Summary
-------
Usr time Sys time
-------- --------
Maketrj 3.281 0.109
Foldtraj 29.891 5.391

Press any key to continue . . .

Using an ASUS K8V Deluxe, 512MB Corsair XMS3500 & WinXP Pro

Awaiting NF3 250 boards here in Australia and will get me WaterCooling going on that setup with some OCZ 3700EB :D Hoping for 2.4GHz - 2.6GHz :)

-Lucus :)

deranged128[OCAU]
05-22-2004, 08:32 AM
Another benchie, just for the heck of it.

This one was run on a superlocked XP2500+ Barton @ 2230MHz, 11 x 202 w/ 1.85 Vcore. RAM is generic DDR 400 with timings set at 2.5-3-3-6, running at 1:1 with vdimm at 2.8v on an ASUS A7N8X-X Deluxe. OS is FreeBSD 5.1.


Summary
-------
Usr time Sys time
-------- --------
Maketrj 5.828 0.203
Foldtraj 34.781 7.672

Paladin
05-30-2004, 12:57 PM
I did a bunch of benchies while anteater & the DF staff were on vacation...
FSB has a greater impact than overall clock speed, with a delta of 100MHz or less. Gentoo Linux clients run faster than the Win2K client. On P3 and P4 the ICC client runs faster than the GCC one.

Following results are not "golden samples", just first run from clean boot.

P3@1250Mhz, 512MB SDRAM, Win2K
Maketrj 12.889 0.571
Foldtraj 113.253 32.867

same as above on Gentoo Linux 2.6.5 w/gcc
Maketrj 9.580 2.080
Foldtraj 91.940 36.030

same as above w/icc
Maketrj 7.110 1.950
Foldtraj 89.350 37.390


P4@1610Mhz, 128MB RDRAM, Win2K
Maketrj 8.172 0.734
Foldtraj 50.828 14.516

same as above on Gentoo Linux 2.6.5 w/gcc
Maketrj 11.480 0.830
Foldtraj 48.900 14.590

same as above w/icc
Maketrj 4.950 0.800
Foldtraj 48.270 15.340


XP2400+@2250Mhz, 512MB SC-DDR, Win2K
Maketrj 6.672 0.500
Foldtraj 39.859 7.984
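
To put rough numbers on the ICC vs. GCC gap, here is a quick Python sketch using only the Foldtraj usr times from the two Gentoo runs above. The helper name `percent_faster` is made up for illustration, and it compares usr time only, not usr + sys.

```python
def percent_faster(baseline, candidate):
    """Percent reduction in run time of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline * 100.0


# Foldtraj usr times in seconds from the Gentoo runs above: (gcc, icc)
runs = {
    "P3@1250MHz": (91.940, 89.350),
    "P4@1610MHz": (48.900, 48.270),
}

for cpu, (gcc, icc) in runs.items():
    gain = percent_faster(gcc, icc)
    print(f"{cpu}: ICC is {gain:.1f}% faster than GCC on Foldtraj usr time")
```

On these single first-run samples that works out to about 2.8% on the P3 and 1.3% on the P4, so the Foldtraj difference is small enough that run-to-run noise could matter.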

qbain
05-30-2004, 06:10 PM
P3-S 1266 / 512MB CL2 PC133 / Linux 2.6.6

GCC:
Maketrj 6.760 1.500
Foldtraj 65.840 18.090

ICC:
Maketrj 4.710 1.450
Foldtraj 62.470 18.570

Version currently used (seems to be newer than the ones on the download page):
Maketrj 4.580 1.500
Foldtraj 60.990 18.490

Basman
06-11-2004, 05:04 PM
Would a processor like a Prescott with 1024 KB of L2 cache instead of only 512 KB give a big performance boost compared to the Athlons and Northwoods?

Grumpy
06-11-2004, 10:37 PM
It does help, though I am unsure by how much; I imagine not a great deal. I would avoid Prescotts for folding due to the heat issues, especially with this protein ..

:flame: :lmao:

FEEDB0B0
06-12-2004, 10:08 PM
Stock G5 Mac Dual 1.8GHz/512M Ram

Single Instance:
Summary
-------
Usr time Sys time
-------- --------
Maketrj 4.910 1.610
Foldtraj 39.380 26.640


Seems to eat system time with this fancy user-interface GUI.
Running -bench simultaneously in 2 processes does pretty much the same
if that matters to anyone.

Cheers
Christopher Hogue

La Muis
06-13-2004, 05:35 AM
G5 Mac Dual 2GHz/2.5G Ram
Mac OS 10.3.4

(only running ./foldtrajlite -bench)

Summary
-------
Usr time Sys time
-------- --------
Maketrj 4.170 1.270
Foldtraj 34.380 20.660



(running ./foldit and ./foldtrajlite -bench simultaneously)

Summary
-------
Usr time Sys time
-------- --------
Maketrj 4.320 1.150
Foldtraj 35.300 18.810

Hua Luo Han
08-02-2004, 10:26 PM
Any benchmarks for the new Celeron 'D' procs? :bouncy:

:trash: or :thumbs:

:confused:

-=N0N@ME420=-
08-03-2004, 09:57 AM
AMD 1600+ 1.4Ghz O/Ced to 1.56Ghz - Palomino Core

The system cannot find the path specified.
One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
Usr time Sys time
-------- --------
Maketrj 10.234 0.563
Foldtraj 63.766 10.703

Press any key to continue . . .

Why do I get the first error, and what are usr time and sys time?

bwkaz
08-03-2004, 06:06 PM
The error may be because you didn't put a .\ in front of the foldtrajlite command (if you didn't) -- I believe that on Windows you have to do that for it to work right.

Sys time is time spent inside the OS kernel. For example, when a program calls an OS function to read data from a file into a buffer, the time that the OS spends actually doing the reading from disk (or from memory, if your OS is smart enough to cache previously accessed file data... ;)) and copying it into your buffer is sys time.

Usr time (an abbreviation for "user") is time spent running the program's code -- the various loops, conditionals, and recursive function calls (if any) that exist in all usermode code. On Windows, this probably includes code located in DLLs that didn't come with the system (depends on what the DLLs are for), and it's debatable whether it also includes DLLs like comdlg32 that perform systemwide functions like bringing up common dialog boxes. It definitely excludes time spent inside things like the video driver. It may or may not exclude the time spent by csrss.exe doing screen updates (that depends on what the OS counts as user time vs. system time).
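
If you want to see the split for yourself, here is a minimal Python sketch. It is not part of the DF client; the function names are made up for illustration, and it only assumes the standard `os.times()` call, which reports the user and system CPU time consumed by the current process.

```python
import os


def measure(fn):
    """Run fn() and return (user_seconds, system_seconds) used by this process."""
    before = os.times()
    fn()
    after = os.times()
    return after.user - before.user, after.system - before.system


def cpu_bound():
    # Pure computation in the program's own code: counted as user time.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def syscall_heavy():
    # Each os.stat() call asks the OS for file metadata, so every
    # iteration crosses into the kernel: this adds to system time.
    for _ in range(50_000):
        os.stat(".")


if __name__ == "__main__":
    for name, fn in (("cpu_bound", cpu_bound), ("syscall_heavy", syscall_heavy)):
        usr, sys_time = measure(fn)
        print(f"{name:14s} usr={usr:.3f}s sys={sys_time:.3f}s")
```

The first function should show up almost entirely as usr time, while the second should report a noticeably larger share of sys time. The exact numbers depend on the OS and its timer resolution.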

erk
08-21-2004, 07:29 AM
Any comparisons between an A64 socket 754 vs socket 939 at the same GHz, OS etc?

tpdooley
08-21-2004, 02:12 PM
I'll have to track down the chart of Athlon 64 model numbers versus GHz, but the Athlon64 3000+ 754-pin CPU is, say, a 2GHz part. The 939-pin CPU at the same GHz is an Athlon64 3200+.
So it might be nice to see an Athlon64 3000+ 754pin cpu compared to the 939 pin Athlon64 3000+ and the 939 part running at the same Ghz as the Athlon 754 pin cpu tested.

tpdooley
08-30-2004, 02:59 PM
http://www.theregister.co.uk/2004/08/30/orion_delivers_personal_cluster/

anyone have one of the 12 cpu Orion clusters to benchmark? :)

ton80
09-01-2004, 01:46 AM
AMD, A64 3200+ Socket 754, 2205 MHz, WinXP Pro SP2, 3.328, 0.234, 32.547, 6.875