amdzone.com has some Seti@home benchmarks just up
Has anybody run DF on an Opteron system yet? Has anybody seen word of its performance elsewhere?
Cheers,
Phil.
I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.
Spending finger getting itchy again, Phil?
I noticed links to the Seti benchmarks over the last few days, and they had similar results with 64-bit code as we've had here; i.e. the 64-bit client ended up being slower than the 32-bit client. It'll be nice to see the benchmark differences between the 32-bit clients running on Opteron/Athlon64 vs normal Athlon/P4, though.
Originally posted by HaloJones
Spending finger getting itchy again, Phil?
Just evaluating potential future purchases. TBH, the Opteron hasn't impressed me with its benchmarks so far, yet Intel seems to be pulling further and further ahead. My current P4 absolutely rocks and I am probably going to go the Xeon route when Intel implement the 800MHz FSB for that platform.
Train hard, fight easy
Psssst, anyone visited http://www.ocworkbench.com lately
I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.
I have a dual Opteron 242 (1.6GHz) running here, and yesterday I benchmarked the ECC2 and the distributed.net clients.
Both turned out to be equal to running two XP 2000+ chips.
As long as there is no optimized client, you can say that a GHz of Opteron equals a GHz of Athlon XP.
Proud member of the Dutch Power Cows
Both RC5 and ECC2 are clock speed dependent. You can directly compare a Duron to an Athlon with those clients.
DF on the other hand should respond well to the Opteron's extra L2 cache and on-die memory controller. It'd be great if you could run the DF bench for me. This will help my next purchase decision.
Cheers.
Train hard, fight easy
Usr time   Sys time
    6.65      0.156
    32.5      5.141
This may be from an AMD64 3100XP, or it may not. It may be an untweaked reference MB running at default DDR400 RAM timings.
I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.
That looks similar to ~2.3GHz AXP. Not too bad but not as fast as I was expecting.
Train hard, fight easy
I think an Opteron 144 would be as fast, and I don't know how much cache is in this CPU. All will be revealed, as I suspect a review is in the making... still waiting for Opteron benches.
The 940-pin AMD64 with 1 MB of cache will be a good bit faster.
I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.
If the DF client runs "twice as fast" with the -rt switch using up to 150MB of RAM, I would think maybe the 4+ GB of RAM potential could be utilized to speed things up further. Not to mention the extremely low latencies of the integrated memory controller.
If DF could truly become multi-threaded, I think you would see a much more significant boost for Opterons in dual or higher configs. As it stands, though, I think all the advantages are being overcome by the low clock speed. (thank god for low IPC)
Wasn't there a stage or two of pipeline added too? If so AMD64 chips may hit their stride at closer to 2.5GHz due to speculative error penalties. (kinda like the P4 not really stacking up until it hit 2.2-2.4 GHz.)
And just for those who like to ride last year's pony - I saw today 1.2GHz Athlon MP's are $55 on pricewatch. So for the price of one 242 Opteron you could have two dualies totaling almost 5GHz! I love bleeding edge tech as much as the next guy but sometimes the argument just comes down to GHz per $$$.
Lastly, the "big" A64 with 1MB of cache essentially IS an opteron from everything I can see. It just won't be Dual-capable (maybe).
Me, I want to stick an Opteron in a freezer with an insane core voltage and see just how mad-scientist I can get.
-Self Denying Stat Ho-
Originally posted by AMDPHREAK
If the DF client runs "twice as fast" with the -rt switch using up to 150MB of RAM, I would think maybe the 4+ GB of RAM potential could be utilized to speed things up further. Not to mention the extremely low latencies of the integrated memory controller.
I don't think so, because there is nothing more to put in the memory. All the files that need to be cached are at most 80 MB.
The German DC Community : Team Rechenkraft.net - Join now ! Rechenkraft.net
"I don't think so, because there is nothing more to put in the memory ... All the files that need to be cached are at most 80 MB."
Actually, I think that's a different chunk of storage you're considering. I don't have much Computer Science training, so I'll have to put this in more verbose layman's terms:
The -rt switch determines whether the folding engine uses a smaller or a larger block of your system's RAM to set up arrays of data for computation on each generation (think of them as stacks of spreadsheets for the moment, though I'm pretty sure we're talking about more than two or three dimensions of data here). If you give DF a large block of RAM, it will set up a cubic array as a tall stack of grids of data. If you select a smaller RAM footprint, it will hold in memory only the fewest grids of data needed to complete the computation. In the large memory model, the computing engine can set up the data arrays once and then start calculating. In the small memory model, the computer must offload the old grids of data regularly to make room for newer info. This housekeeping takes extra time.
The proposal to take advantage of larger memory models (continuing my oversimplification) might keep the past couple of generations' data cubes in memory to aid somehow in calculation of the next generation of data.
Memory chips are much cheaper than they ever have been before. This condition allows programmers to do one of three things: 1) add more cup-holders, blinky-lights, bells and whistles to existing code (see M$oft), 2) perform less problem simplification and data reduction in order to spend more time on actual computation of complex systems (see the -rt switch), or 3) ignore the extra resources and let the user run more tasks.
If the folding algorithm can be made faster by keeping still more data in memory than it does today, it would be nice to see a -XRT switch that uses more memory (where available) in order to fold more proteins per hour.
-djp
I'm not a Stats Ho either. I just want to go and check to see that all my spare boxen are busy. Hang on a minute....
djp, the reason why the -rt switch uses more memory and speeds up the folding client is because it reduces the need to read from disk.
Without the -rt switch, the client needs to read the protein.trj file each time it works on a protein and this takes time since reading from the hard drive or a USB memory stick or CD or whatever is much slower than RAM.
Using the -rt switch loads the protein.trj file into memory so it no longer needs to read this data from disk each time it works on a protein.
That is why the readme says up to 150 MB of RAM: with the sizes of protein the DF project expects to work on, the protein.trj file won't exceed that limit.
Hope that helps explain why using more memory wouldn't speed things up the way the client is currently designed.
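For anyone who prefers to see the idea in code, here is a minimal Python sketch of the difference between re-reading a file for every piece of work and loading it into RAM once. It is only an illustration, not the DF client's actual code: the class TrajectorySource, the keep_in_ram flag (standing in for -rt) and the helper count_disk_reads are all made-up names, and a small temporary file stands in for protein.trj.

```python
import os
import tempfile


class TrajectorySource:
    """Hypothetical stand-in for how a client might fetch trajectory data."""

    def __init__(self, path, keep_in_ram=False):
        self.path = path
        self.keep_in_ram = keep_in_ram  # plays the role of the -rt switch
        self.disk_reads = 0             # how many times we hit the disk
        self._cache = None

    def _read_from_disk(self):
        self.disk_reads += 1
        with open(self.path, "rb") as fh:
            return fh.read()

    def load(self):
        # Without caching, every request goes back to the disk.
        if not self.keep_in_ram:
            return self._read_from_disk()
        # With caching, only the first request touches the disk.
        if self._cache is None:
            self._cache = self._read_from_disk()
        return self._cache


def count_disk_reads(keep_in_ram, structures=5):
    """Request the data once per structure and report the disk reads."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 1024)  # dummy contents, not a real .trj file
        path = tmp.name
    try:
        source = TrajectorySource(path, keep_in_ram)
        for _ in range(structures):
            source.load()
        return source.disk_reads
    finally:
        os.remove(path)


print("no caching:", count_disk_reads(False), "disk reads")
print("cached    :", count_disk_reads(True), "disk reads")
```

Once the whole file is cached, the disk read count cannot drop any further, which is why extra RAM beyond the size of the file would not help in this simplified picture.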
Jeff.
These benches were done by an OCAU member in early July; I was hoping someone else would post them here.
Code:
processor  : 0 and 1
vendor_id  : AuthenticAMD
cpu family : 15
model      : 5
model name : AMD Opteron(TM) 64 Processor 242
stepping   : 0
cpu MHz    : 1593.811
cache size : 1024 KB
bogomips   : 3178.49
Code:
# > top
 10:23pm up 5:37, 4 users, load average: 1.32, 1.36, 1.28
64 processes: 61 sleeping, 3 running, 0 zombie, 0 stopped
CPU0 states: 99.1% user, 0.0% system, 98.0% nice, 0.1% idle
CPU1 states: 99.0% user, 0.0% system, 99.0% nice, 0.1% idle
Mem:  8071096K av, 2253444K used, 5817652K free, 0K shrd, 35476K buff
Swap: 16779852K av, 0K used, 16779852K free 1861148K cached

  PID USER  PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
10428 user   39  19 88644  86M  1836 R N  99.9  1.0  30:37 foldtrajlite
10754 user   39  19 87916  85M  1784 R N  98.8  1.0   0:16 foldtrajlite
10756 user   15   0  1044 1044   772 R     0.9  0.0   0:00 top

#1 > foldtrajlite -bench
One moment, opening rotamer library...
Predicting secondary structure and generating trajectory distribution...
Folding protein...
Benchmark complete.

Summary
-------
           Usr time  Sys time
           --------  --------
Maketrj       2.720     0.540
Foldtraj     37.310     4.300

#2 > foldtrajlite -bench
<snipped>
           Usr time  Sys time
           --------  --------
Maketrj       2.960     0.360
Foldtraj     36.780     4.780

#3 > foldtrajlite -bench
<snipped>
           Usr time  Sys time
           --------  --------
Maketrj       2.810     0.490
Foldtraj     37.230     4.400

# > uname -a
# > cat /etc/UnitedLinux-release
Linux servername 2.4.19-SMP #1 SMP Wed Feb 12 18:42:27 UTC 2003 x86_64 unknown
UnitedLinux 1.0 (x86_64)
VERSION = 1.0

An experimental spare box, for only 2 weeks tho.
Unfortunately DF is compiled for a 32 bit proc under Linux.
So 32 bit execution is emulated. Not fully optimized yet.
Unmodded, bios says ~1.56v for each CPU, ~70deg C max on load in a 1RU.
Look at those scores
Look at the results our dual MPs get at 2 GHz or higher and look at the dual Opteron... I am speechless. I am phoning my bank manager right now for a loan and getting me a dual 246 system.
Thanks for the info
Well, the truth of those benchmarks is that the Opteron 140 is faster than an MP2400 @ 2 GHz. So for those doing DC on duallies, it is a far superior setup. Value for money, well, not right now.
I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.
how to interpret those benchmarks ? I have a 142 at 1.8GHz.........
All I can remember is that lower numbers are better. Some of the numbers are much more important than the others, can't remember which though
Originally posted by Hua Luo Han
how to interpret those benchmarks ? I have a 142 at 1.8GHz.........
Ignore the Maketrj numbers. Add the two Foldtraj numbers together. That sum tells how long it takes to fold a standard piece of work.
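As a worked example of the "add the two Foldtraj numbers" rule, here is a small Python sketch that pulls the Foldtraj line out of a -bench summary and sums the Usr and Sys times. The function name foldtraj_total and the RUN_1 variable are made up for illustration; the numbers are run #1 of the dual Opteron 242 posted earlier in this thread, and the parsing assumes the summary layout shown there.

```python
def foldtraj_total(summary):
    """Return Usr time + Sys time from the Foldtraj line of a summary."""
    for line in summary.splitlines():
        fields = line.split()
        # The line of interest looks like: Foldtraj <usr> <sys>
        if len(fields) == 3 and fields[0] == "Foldtraj":
            return float(fields[1]) + float(fields[2])
    raise ValueError("no Foldtraj line found")


# Run #1 from the dual Opteron 242 post in this thread.
RUN_1 = """\
           Usr time  Sys time
           --------  --------
Maketrj       2.720     0.540
Foldtraj     37.310     4.300
"""

# The Maketrj line is skipped; only Foldtraj counts toward the score.
print(f"Foldtraj total: {foldtraj_total(RUN_1):.2f}")
```

Lower totals are better, so two machines can be compared by running the same function over each one's summary.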
That's not bad, if I understand it correctly
Compare it to the 1.4GHz Opteron I got access to:
Code:
Summary
-------
           Usr time  Sys time
           --------  --------
Maketrj       4.740     0.670
Foldtraj     47.440     8.750