Page 3 of 5 FirstFirst 12345 LastLast
Results 81 to 120 of 181

Thread: Call for Benchmarks

  1. #81
    Senior Member
    Join Date
    Jan 2002
    Location
    England, near Europe
    Posts
    211
    Originally posted by Grumpy
    So close to breaking 20

    Yeah, that's what I was aiming for. I'll try a higher o/c tomorrow
    Train hard, fight easy


  2. #82
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    If the Fire Brigade aint at ya front door, yor not OCing it enough
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  3. #83
    Senior Member
    Join Date
    Jan 2002
    Location
    England, near Europe
    Posts
    211
    Hmmm, deja vu.

    Anyway, I managed to break the 20 secs mark with:

    Intel P4, 3724MHz, WinXP Pro SP1, 1GB

    5.734, 0.500, 19.688, 8.406 (Screenshot)

    The system ran the Prime95 torture test for just over an hour before running the DF bench so it seems stable enough at that speed. The problem is the high vcore (1.8V)...it's just too high for my comfort to run 24/7. I have now dropped back to 3.5GHz and 1.65V.
    Train hard, fight easy


  4. #84
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Well, I have discovered that a Client Priority of 0 is best under Win2K...my Foldtraj went from 65 to 53.5 seconds
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  5. #85
    Junior Member [da'rayven]'s Avatar
    Join Date
    Aug 2003
    Location
    North London, UK
    Posts
    11
    Client bench doesn't work on the MacOS X client anymore

    At least on two machines, the bench always dies with a generic Error 3...

    As for the PCs:

    Athlon XP @ 2.3GHz, 200FSB, 512MB Dual Channel 11-3-2-2.5 PC3200, Windows XP

    Code:
    One moment, opening rotamer library...
    Predicting secondary structure and generating trajectory distribution...
    Folding protein...
    Benchmark complete.
    
    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      6.609     0.344
    Foldtraj    32.297     6.109
    
    Press any key to continue . . .

    Athlon XP Palomino @ 1.66GHz, 145FSB, 128MB 5-2-2-2 PC2100, SuSE Linux 8.2

    Code:
    One moment, opening rotamer library...
    Predicting secondary structure and generating trajectory distribution...
    Folding protein...
    Benchmark complete.
    
    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      5.010     0.420
    Foldtraj    58.848     7.600

    P4 Northwood 2.0GHz @ 2.6GHz, 130FSB, 512MB 4-2-2-2 PC2700, Windows XP

    Code:
    One moment, opening rotamer library...
    Predicting secondary structure and generating trajectory distribution...
    Folding protein...
    Benchmark complete.
    
    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      9.391     0.547
    Foldtraj    37.609    11.984
    Proud member of the OCWorkbench Distributed Folding team

  6. #86
    Junior Member
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    27

    Re: Call for Benchmarks

    Apple,PowerPC G4,466,MacOSX10.1.5,15.600,0.000,172.490,0.000
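    For anyone collecting these one-line submissions, here is a minimal sketch of reading one into named fields. The field order (vendor, CPU, MHz, OS, then Maketrj usr/sys and Foldtraj usr/sys) is an assumption inferred from the other results in this thread, and the `BenchResult` and `parse_result` names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class BenchResult:
    # Field names and order are assumptions inferred from the thread.
    vendor: str
    cpu: str
    mhz: int
    os: str
    maketrj_usr: float
    maketrj_sys: float
    foldtraj_usr: float
    foldtraj_sys: float


def parse_result(line: str) -> BenchResult:
    """Split one comma-separated submission into typed fields."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    vendor, cpu, mhz, os_name = parts[:4]
    times = [float(x) for x in parts[4:]]
    return BenchResult(vendor, cpu, int(mhz), os_name, *times)


result = parse_result(
    "Apple,PowerPC G4,466,MacOSX10.1.5,15.600,0.000,172.490,0.000"
)
print(result.cpu, result.foldtraj_usr)
```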
    Derek

  7. #87
    Junior Member [da'rayven]'s Avatar
    Join Date
    Aug 2003
    Location
    North London, UK
    Posts
    11
    okay, in that case, either it doesn't work on 10.2.x, or it will work with the new client.... I'll try again later
    Proud member of the OCWorkbench Distributed Folding team

  8. #88
    Junior Member
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    27
    Originally posted by [da'rayven]
    okay, in that case, either it doesn't work on 10.2.x, or it will work with the new client.... I'll try again later
    I have a feeling it will work just fine. The last client did NOT work on my machine with 10.1.5 - it would fail on the trajectory thing (after gen0). It works absolutely perfectly now (with the exception of the native.val mixup).

    Let me know if you are unable to get the current client to run on 10.2.x. I have a 10.3 beta installed and could see if it runs on that - if it runs fine on the 10.3 beta, then it should run under Jaguar.
    Derek

  9. #89
    Junior Member [da'rayven]'s Avatar
    Join Date
    Aug 2003
    Location
    North London, UK
    Posts
    11
    The client runs. It's the benchmark that doesn't. I have been folding with my Macs as long as I have been folding. What I'm saying is it's a few weeks since I tried, and maybe the new client's bench will work...
    Proud member of the OCWorkbench Distributed Folding team

  10. #90
    Originally posted by TheOtherPhil
    Actually Grumpy, I am not convinced that it does. I am estimating that a dual AMD is something like 70% efficient for DF....if that. I'm personally running 4x dual AMD's and a P4 (~19.6GHz). 24/7 power is 2x duals and the P4 (11.8GHz). The part time dual's (~7.8GHz) run ~8hrs a day. All run as a service with useram=1.

    My daily output is ~240K/ day. I really should be getting much higher than that I feel with the power I have invested in this project.

    I am going to conduct a small test within the next few weeks where I remove the procs from my 2x full time dual's and run them in uni-processor boards for a while. I am expecting to see significantly higher numbers (~+30%).
    I am not getting the SMP results either: K7D motherboard with a pair of MP2800+. I tried two versions of FreeBSD and RedHat 9.0. FreeBSD 4.8-RELEASE was the quickest, but not by much:

              Usr time  Sys time
              --------  --------
    Maketrj      4.836     0.875
    Foldtraj    58.367     3.242


    My soltek SL-75FRN2 with XP2600+ and RedHat 9.0:

              Usr time  Sys time
              --------  --------
    Maketrj      3.590     0.620
    Foldtraj    37.120    11.750

  11. #91
    Could someone please explain exactly what the four numbers returned by the benchmark mean?

  12. #92
    athlon 64 3200
    gigabyte k8vt800pro
    256 mb apacer pc3200 cl3

    winXP 32bit

    Maketrj 5.939, 0.300
    Foldtraj 36.663, 5.438

    Mandrake linux 64 bit beta

    Maketrj 2.210, 0.470
    Foldtraj 33.340, 3.880

  13. #93
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    CL3 Ram


    Nice time for ram at that speed...can you get the Ram down to 2.5 and try it ?

    And Mandrake 64 Bit seems to be getting some extra juice too, it is a good sign for the 64 Bit Code running 32 Bit Apps at faster speeds
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  14. #94
    Nice time for ram at that speed...can you get the Ram down to 2.5 and try it ?
    I only have an adjustment for the ram voltage in the bios.
    The memory controller is integrated into the cpu so I guess the CAS timing is not adjustable.

    Here is the benchmark at 2.2 GHz

    Maketrj 1.99, 0.480
    Foldtraj 30.710, 3.300

  15. #95
    I'd love to know a bit more about this benchmark program. TheOtherPhil got sub-20 seconds with his P4 at 3.7GHz. My XP @2400 gets 35 seconds which suggests that the benchmark program reflects pure MHz. Yet my office 2400MHz P4s suck producing much slower than my home Athlons. (I'll benchmark a sample office P4 tomorrow.)

    Most crunchers here seem to agree that Athlons are faster than P4s at DF so how come the benchmarks don't reflect that? Is the benchmark representative?

    What does it actually mean?
    Last edited by HaloJones; 10-12-2003 at 02:08 PM.

  16. #96
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    P4 + 800 FSB + 865/875 + HT = Below 20 Seconds

    Whether this transfers to real world speed over the Athlons is another question.. it is possible that only the benchmark gets a boost from the above points, I doubt we will ever prove or disprove it
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  17. #97
    Originally posted by Grumpy
    P4 + 800 FSB + 865/875 + HT = Below 20 Seconds

    Whether this transfers to real world speed over the Athlons is another question.. it is possible that only the benchmark gets a boost from the above points, I doubt we will ever prove or disprove it
    P4 @ 3.7GHz in a phase change cooled computer. Unless DFII uses SSE2 or Netburst, it cannot compute DF as fast per MHz as an Athlon, simply due to the number of instructions per clock cycle. I'm not trying to re-start an old argument here but programs have to be specifically written for P4s to take advantage of them. A simple x86 routine is quicker on AMD than Intel.

    Perhaps Howard could enlighten us on what the benchmark actually does.

  18. #98
    As promised:

    Predicting secondary structure and generating trajectory distribution...
    Folding protein...
    Benchmark complete.

    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      9.438     0.750
    Foldtraj    47.328    15.922


    P4-2400 (W2K)

    Athlon XP at the same clockspeed does 35 seconds.
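
    To put a number on that gap, a quick sketch using the Foldtraj usr time above and the rounded 35-second Athlon XP figure (the variable and function names are just for illustration):

```python
# Foldtraj usr times in seconds, taken from this post.
p4_foldtraj_usr = 47.328      # P4-2400 under W2K
athlon_foldtraj_usr = 35.0    # Athlon XP at the same clock (rounded figure)


def slowdown(slower: float, faster: float) -> float:
    """How many times longer the slower run takes."""
    return slower / faster


ratio = slowdown(p4_foldtraj_usr, athlon_foldtraj_usr)
print(f"P4 takes {ratio:.2f}x as long as the Athlon XP at the same clock")
```

    That works out to the P4 needing roughly a third more time per run at equal MHz, at least on this benchmark.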

  19. #99
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    Originally posted by HaloJones
    Unless DFII uses SSE2 or Netburst, it cannot compute DF as fast per MHz as an Athlon, simply due to the number of instructions per clock cycle.
    And even if it does use SSE2 (or Netburst? dunno, I'm not familiar with what Netburst is), it still won't be able to compete with the Athlon.

    The vast majority of the DF client's time is spent chasing pointers (AKA, doing integer arithmetic on memory addresses), not doing floating-point stuff. That's why the current client doesn't even use SSE (and may not use MMX, either) -- there's simply nothing to be gained from it, because that's not where the code hot-spots are.
    "If you fail to adjust your notion of fairness to the reality of the Universe, you will probably not be happy."

    -- Originally posted by Paratima

  20. #100
    I think I've mentioned before but the benchmark builds a .trj file (trajectory distribution) for one particular sequence, and then builds 100 structures of it (like gen. 0). The protein is always the same, regardless of what protein we are working on. The random seed is fixed as well, so the procedure is completely deterministic (will always make the same 100 structures). Unfortunately this doesn't hold true across different operating systems as the floating point rounding error seems to vary on different platforms which in turn influences the sequence of events.


    Thus it should reflect well the performance of the actual client in most cases.
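
    A rough illustration of the fixed-seed idea, not the actual client code: Python's `random.Random` stands in for whatever generator the client uses, and the seed value and `build_structures` name are invented for the example. The last lines show how floating point rounding can depend on the order of operations, which is the kind of effect that could make platforms diverge.

```python
import random


def build_structures(seed: int, count: int = 100) -> list[float]:
    """Stand-in for building `count` structures from a fixed random seed."""
    rng = random.Random(seed)  # private generator, so the seed fully fixes the run
    return [rng.random() for _ in range(count)]


# Same seed, same sequence: the run is deterministic on one platform.
run_a = build_structures(seed=12345)
run_b = build_structures(seed=12345)
print(run_a == run_b)

# Rounding depends on evaluation order, so two builds that order their
# arithmetic differently can take different branches later on.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)
```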
    Howard Feldman

  21. #101
    Senior Member
    Join Date
    Jan 2002
    Location
    England, near Europe
    Posts
    211
    Mike, the P4 I was using was running a 266fsb (1064MHz Quad Pumped) with the RAM at very aggressive timings (~6GB/s mem bandwidth Sandra Bench). Clock for clock the Athlon may be faster but the P4 in question has a 1.3GHz Clock speed advantage over your 533fsb 2.4's and almost double the effective FSB.
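
    The arithmetic behind those figures, as a small sketch. The 3724MHz clock is from the earlier result post and 2400MHz/533fsb are the office P4 figures as given above; `effective_fsb_mhz` is just a helper name for the example.

```python
def effective_fsb_mhz(base_clock_mhz: float, pumps: int = 4) -> float:
    """Quad-pumped bus: effective rate is the base clock times four."""
    return base_clock_mhz * pumps


oc_fsb = effective_fsb_mhz(266)   # overclocked P4: 266 x 4
stock_fsb = 533                   # effective FSB of the 2.4GHz P4s
fsb_ratio = oc_fsb / stock_fsb    # "almost double"
clock_gap_mhz = 3724 - 2400       # core clock advantage in MHz

print(oc_fsb, round(fsb_ratio, 2), clock_gap_mhz)
```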

    FWIW, the P4 chewed through DF extremely fast and pretty much equalled my dual Bartons at 2.3GHz in output.
    Train hard, fight easy


  22. #102
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Yer, the OCed PIV running above 800 FSB is ugly

    And TheOtherPhil, your Signature is scaring the children, damn snoop coder
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  23. #103
    My office P4s are almost certainly 400MHz FSB since they are "cheap" and nasty office-use Compaq Evos.

    Off-topic: why do businesses allow themselves to get so badly ripped off by the big manufacturers? I'm all for buying super-stable machines but do they need to be s o s l o w?

  24. #104
    Junior Member
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    27

    G5 benchmarks?

    Does anybody here have a Power Mac G5? I would love to see how one of those performed, cause my G4 just plain sucks in dfold.
    Derek

  25. #105
    Not the best performance, and not under 20 sec, but interesting numbers imo.
    Pentium-M "Centrino" 1,4Ghz WinXP Home SP1 256MB DDR266
    Maketrj 7.571 0.651
    Foldtraj 44.304 10.715
    Wonder if it is possible to run that cpu on desktop motherboard :P

  26. #106
    Originally posted by [veix]
    Not the best performance, and not under 20 sec, but interesting numbers imo.
    Pentium-M "Centrino" 1,4Ghz WinXP Home SP1 256MB DDR266
    Maketrj 7.571 0.651
    Foldtraj 44.304 10.715
    Wonder if it is possible to run that cpu on desktop motherboard :P
    There are Mini-ITX motherboards coming out for it.

    http://www.lippert-at.com/miniitx.html

  27. #107
    p4m 1,8GHz, 512mb, WinXp Sp1

    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj     17.085     0.781
    Foldtraj    64.803    18.166

    Sum = 64.803 + 18.166 = 82.969
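
    The same sum can be pulled straight out of the pasted benchmark text. A minimal sketch, assuming the summary rows always look like `Maketrj <usr> <sys>` and `Foldtraj <usr> <sys>`; the `parse_summary` name and the regular expression are my own, not part of the client.

```python
import re

SUMMARY = """\
Summary
-------
Usr time Sys time
-------- --------
Maketrj 17.085 0.781
Foldtraj 64.803 18.166
"""

# One row per benchmark phase: name, usr time, sys time.
ROW = re.compile(r"^\s*(Maketrj|Foldtraj)\s+([\d.]+)\s+([\d.]+)\s*$", re.MULTILINE)


def parse_summary(text: str) -> dict[str, tuple[float, float]]:
    """Return {phase: (usr_seconds, sys_seconds)} from a benchmark summary."""
    return {name: (float(usr), float(sys_)) for name, usr, sys_ in ROW.findall(text)}


times = parse_summary(SUMMARY)
foldtraj_total = sum(times["Foldtraj"])  # usr + sys
print(f"Foldtraj usr + sys = {foldtraj_total:.3f}")
```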

    Didn't know what was better to do... have tried to write everything I had.
    EhEH...

    Ciao!

  28. #108
    Alive and XXXXing
    Join Date
    Nov 2003
    Location
    GMT +3
    Posts
    55
    P4C 2.4 @ 3.0, (800 MHz FSB oc'd -> 1000)
    512 MB PC3200 CL3 noname RAM (Samsung chips)

    One moment, opening rotamer library...
    Predicting secondary structure and generating trajectory distribution...
    Folding protein...
    Benchmark complete.

    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      7.156     0.484
    Foldtraj    31.156     8.563

    This result is with memory running synchronously (250 MHz DDR = 500 MHz, 1 GHz FSB) but with very loose timings 3-4-3-6. Even so, my mem voltage is at 2.95 volts. PAS is set to "Ultra Turbo" (fastest).

    With Memory running asynch at native PC3200 speeds (200 MHz DDR = 400 MHz, 1 GHz FSB) I can set timings to 2-2-2-5, but the machine runs a tad slower, giving something around USR = 33 SYS = 8.7

    System is Win XP Pro.

  29. #109
    Running on an AMD Opteron 240 with 4gb ECC/Registered DDR333
    Linux 2.6.0 SMP 32-bit NUMA Optimised

    ---
    One moment, opening rotamer library...
    Predicting secondary structure and generating trajectory distribution...
    Folding protein...
    Benchmark complete.

    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      4.250     0.850
    Foldtraj    36.880    11.280


    ---

    I would appreciate it if anyone knows how to configure the benchmark to run on 2 processors simultaneously.

    I ran 2 benchmarks in 2 separate windows "almost" simultaneously (press enter, switch to another window, press enter again) and got roughly the same output as above.
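
    One way to take the window switching out of the picture is to start both copies from a small script. This is only a sketch: the real benchmark command line is not given in this thread, so a short CPU-bound Python one-liner stands in for it (`STAND_IN_CMD` is a placeholder to swap for the actual client invocation), and `run_concurrently` is a made-up helper.

```python
import subprocess
import sys
import time

# Placeholder workload; replace with the real benchmark command.
STAND_IN_CMD = [sys.executable, "-c", "sum(i * i for i in range(200000))"]


def run_concurrently(cmd: list[str], copies: int = 2) -> tuple[list[int], float]:
    """Start `copies` processes back to back, then wait for all of them."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(copies)]  # all started before any wait
    codes = [p.wait() for p in procs]
    return codes, time.perf_counter() - start


codes, elapsed = run_concurrently(STAND_IN_CMD)
print(codes, f"{elapsed:.2f}s wall clock")
```

    The operating system still decides which CPU each process lands on, so this only makes the start times nearly simultaneous; it does not pin processes to processors.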

  30. #110
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    The way you describe is how I do it. Just swap windows and run the second

    What Client was that tested on. The regular or one of the Test Clients. It is very fast for a 140. Almost as fast as a 3200 Barton @ 200 FSB

    Damn, I am saving up now, forget the Athlon64 3000, I want a Opteron Duallie after all

    O yeah, is it the Iwill MB, and what video card is it running etc etc
    Last edited by Grumpy; 12-19-2003 at 06:30 AM.
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  31. #111
    Please note that for the recent beta clients, and the new one being released now (as indicated in the whatsnew.txt), the benchmark is no longer comparable to past benchmarks, due to the changes made to the algorithm. Interestingly, the new benchmark can show how much the algorithm has been sped up compared to the old algorithm. Therefore, please don't base hardware decisions on comparisons of old vs. new benchmarks.
    Howard Feldman

  32. #112
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Yeah, that is why I asked for the Client Version. But it would have to have been done with the 108 I imagine, so 37 is very fast for the 140 all the same. It appears Linux 64 is running the Client a lot faster than Linux 32 Bit, even without a recompile

    Umm, what precision are the numbers running at with the Client Howard...64, 72
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  33. #113
    Ooops, my apologies, forgot to mention that's using the new client so yeah, the numbers aren't that great... I'm more interested in how well it scales in SMP for NUMA vs non-NUMA, which is why I asked if there's a better way to run 2 clients simultaneously.

  34. #114
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    I get 36 seconds for Foldtraj with the new updated Client on a NF2 MB and Barton @ 2275 Mhz, so if the 240 1.4 Ghz Opteron gets close to this, I am very very impressed with your configuration
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  35. #115
    Originally posted by Grumpy
    The way you describe is how I do it. Just swap windows and run the second

    What Client was that tested on. The regular or one of the Test Clients. It is very fast for a 140. Almost as fast as a 3200 Barton @ 200 FSB

    Damn, I am saving up now, forget the Athlon64 3000, I want a Opteron Duallie after all

    O yeah, is it the Iwill MB, and what video card is it running etc etc
    ya, thats using the Iwill DK8SL with on-board ATi RageXL video... its not the DK8X workstation board unfortunately.

    I've noticed quite a bit of performance improvement 2 days ago when I switched to 2.6.0 NUMA optimised... I think the difference is NUMA vs non-NUMA on the Opteron since the client is compiled in 32-bit so using a 64-bit kernel won't net any real benefit unless the calls to system libraries benefit from 64-bit in some way

    Since each client uses up to 150MB of RAM, there are real benefits to be had with NUMA

  36. #116
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    If the numbers being crunched are 64 bit precision, then it will make a heck of a difference as it can run it native and not have to emulate 64 bit precision

    So it is a dual 240 system, and does the MB have the RAM shared so that CPU 2 goes through CPU 1 for memory?
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  37. #117
    No, it has 8 ram slots, 4 per processor in a 4+4 configuration (Iwill doesn't make castrated motherboards in a 4+0 configuration)

    I use 4 sticks of 1GB ECC/Registered DDR333, so 2 sticks per processor with both processors running in 128-bit memory path.

    If CPU1 has to go through CPU0 for ram then NUMA optimisations mean squat

    True, if the client uses double precision floating point computations, then compiling for x86-64 "may" see quite a significant improvement... it comes down to 32-bit vs 64-bit, although if the Ultrasparc numbers aren't anything to write home about, going to 64-bit may not be much of an improvement, if any.... you'll probably have to hack the code somewhere since a "double" on a 64-bit architecture means 128-bit precision; if you're only aiming for 64-bit then you're doing more work than you need to.
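
    On the width of a "double", it may be worth checking before hacking any code: as far as I know, a C double on x86 and x86-64 is the 64-bit IEEE 754 format whether the build is 32-bit or 64-bit. A quick sketch in Python, whose float is a C double underneath:

```python
import struct
import sys

# Size of a native C double in bytes, as seen by this interpreter's compiler.
double_bytes = struct.calcsize("d")

# Significand precision of the platform double (53 bits for IEEE 754 binary64).
mantissa_bits = sys.float_info.mant_dig

print(f"double is {double_bytes * 8} bits wide, {mantissa_bits}-bit significand")
```

    Running this on both a 32-bit and a 64-bit install should give the same answer, which would suggest any x86-64 gain comes from elsewhere (extra registers, for instance) and not from wider doubles.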

  38. #118
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Hmmm, in the last Opteron 140 benchmark someone posted, my Barton was 32 seconds and the 140 was 46 seconds in the Fold Benchmark
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  39. #119
    Senior Member
    Join Date
    Jul 2003
    Location
    Hamburg/Germany
    Posts
    386
    These are the benchmarks for the new client on a P4 2.66Ghz with 512MB PC1066 Rambus and Win2K:


    One moment, opening rotamer library...
    Predicting secondary structure and generating trajectory distribution...
    Folding protein...
    Benchmark complete.

    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      8.582     0.340
    Foldtraj    34.329     9.113


    I think that's a pretty good score...

    can anybody else post some new benchmarks?


    Greets thor

  40. #120
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Here is my Dual 2400 MP

    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      9.266     0.484
    Foldtraj    59.859    11.875


    And my AMD Barton @ 2275 Mhz

    Summary
    -------
              Usr time  Sys time
              --------  --------
    Maketrj      6.156     0.234
    Foldtraj    36.578     6.703
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.
