Getting the most out of your quad core CPU

DOSGuy

01-17-2008, 11:35 AM

Dual core has been around for a few years, and it looks like quad core could become standard in the near future. It seems like a good time to talk about how to take advantage of this unprecedented processing power.

I have a Q6600, so I figured that I must have enough spare CPU resources left between the four cores to run a fifth client. When I was running one client on each core, I was getting 3M per client, so that's my baseline. I then ran:

sobsvc -p:5:3:0:0:0:0:1

Just to recap, this tells SB to run 5 clients, and the "3" says that I'm going to select the affinity for each client. The first four clients have affinity "0", meaning they get their own core, and the fifth client is affinity "1", meaning that it's allowed to run on all of the CPUs.
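To make the recap concrete, here is a minimal Python sketch that decodes a -p string the way this thread describes it. It is not sobsvc's actual parser: the function name describe_p_option and the description strings are made up for illustration, and only the values explained in this thread (0, 1, and the per-client list selected by 3) are handled.

```python
def describe_p_option(arg):
    """Decode a '-p:...' string using only the meanings given in this thread.

    Hypothetical helper for illustration; sobsvc itself may behave differently.
    """
    fields = arg.split(":")
    if len(fields) < 3 or fields[0] != "-p":
        raise ValueError("expected something like -p:5:3:0:0:0:0:1")

    count, mode = int(fields[1]), int(fields[2])
    meanings = {0: "own core", 1: "any CPU"}

    if mode == 3:
        # Mode 3: one affinity value follows for each client.
        affinities = [int(f) for f in fields[3:]]
        if len(affinities) != count:
            raise ValueError("need exactly one affinity value per client")
    else:
        # Otherwise the single value applies to every client.
        affinities = [mode] * count

    unknown = [a for a in affinities if a not in meanings]
    if unknown:
        raise ValueError(f"affinity values {unknown} are not explained in this thread")
    return [meanings[a] for a in affinities]


for i, desc in enumerate(describe_p_option("-p:5:3:0:0:0:0:1"), start=1):
    print(f"client {i}: {desc}")
```

Run on the command line above, this lists four clients with their own core and a fifth that may run anywhere.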

My hope was that it would use any spare power that any of the four CPUs had left, but the built-in problem is that it can only run on one CPU at a time, so even if all four CPUs have some resources left, only one CPU will actually get to run the fifth client. In theory, it should ensure that the fifth client always gets some time, but it would require 8 clients to truly use all of the spare resources of each core, and if I was going to run 8 clients, why not just use affinity 1 or 2?

The net result was that the first four clients continued to run at the same speed, and the fifth client ran between 500K and 800K. Basically, there was practically no improvement over running four clients.

Still convinced that four cores must have enough spare power to get decent performance across five clients, I tried:

sobsvc -p:5:1

This means that all five clients can use whichever CPU happens to be available. This seems like it would always be the best solution for any number of clients, because they'll each grab whatever CPU time is available. The result was kind of weird.

My first four clients lost an average of 350K each, but my fifth client jumped to 2M. I lost 1.4M but gained 1.5M, so there was a gain of about 100K over the 0:0:0:0:1 method, for a total gain of about 600K over just having four clients.
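The arithmetic can be checked with a few lines of Python. The figures are the ones quoted in this post, in K. Taking the low end of the 500K to 800K range for the fifth client in the 0:0:0:0:1 setup is an assumption; it is the value that makes the "about 100K" figure work out.

```python
# All figures in K, taken from the post.
baseline_per_client = 3000                 # one client per core
four_clients = 4 * baseline_per_client     # total with just four clients

# -p:5:3:0:0:0:0:1: first four unchanged, fifth between 500 and 800
pinned_low = four_clients + 500            # assumed low end of the range
pinned_high = four_clients + 800

# -p:5:1: first four lose about 350 each, fifth runs at about 2000
floating = 4 * (baseline_per_client - 350) + 2000

print(f"four clients only : {four_clients}K")
print(f"0:0:0:0:1 setup   : {pinned_low}K to {pinned_high}K")
print(f"-p:5:1 setup      : {floating}K")
print(f"gain over four    : {floating - four_clients}K")
print(f"gain over 0:0:0:0:1 (low end): {floating - pinned_low}K")
```

Note that at the high end of the range (fifth client at 800K), the 0:0:0:0:1 setup would actually come out slightly ahead, so the difference between the two methods is within the noise.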

I'm not really sure why the performance isn't equal across all five clients. It looks like the first four are generally getting a core to themselves, and the fifth one is bouncing around between all of the cores. The fifth client steals a bit of performance from the other four, but all four cores are always running one client, and running a second client 1/4 of the time.

It still seems like the ideal way to maximize each core is to run eight clients, but it looks like there's almost nothing left for the second client on each core to pick up. You might get an extra few hundred K out of each core, but there's still very little benefit from running 5, 6, 7 or 8 clients compared to just running 4.

So, I have nothing exciting to report, but I wanted to share my research. Maybe others will share how they maximize performance on their quad core CPUs, and hopefully bring some new ideas to the table. The possibilities will become more interesting when Nehalem comes out, which will bring back hyperthreading: the ability to run a second thread on each core using any leftover resources that the first thread isn't using. Theoretically, to maximize the potential of hyperthreading, Intel will duplicate some of the CPU resources so that the second thread can be used more often, even when it wants the same resources as the first thread.

engracio

01-17-2008, 12:42 PM

Great post, thanks for the info. One of these days a quadie will be crunching in my herd. :)

e

vjs

01-17-2008, 05:32 PM

DOSGuy,

Thanks for the work in trying to optimize that processor; it's something we tried in the past with the P4s and HT. I'm not surprised that E chimed in, considering he is running those dual-processor Xeons with HT :thumbs:.

E, I'm looking forward to you getting that Q6600. At this point, and with the recent news that Intel is not releasing processors until November, now is really the time to look at getting one. With 260 for the processor, about 80 for a P35 board, and 2G of DDR2-800, you can't go wrong.

I currently have a Q6600. Initially I had clocked the living daylights out of it, to 4.0 GHz. At 4 GHz I had random crashes, which went away entirely by 3.8 GHz.
But trust me, people: it appears very stable but it's not. I had to back mine down to 3.4 GHz, and even now I'm not certain. If you're going to overclock, be sure you're stable... once you're Prime95 stable, take away 200 MHz.

On the Q6600 and the P35 boards, your best bet is to run fairly even multipliers without going crazy on your memory.

Currently I'm running a 425 MHz FSB with an 8x multiplier and memory at 850 MHz. That is much more stable than 9x at 377 MHz with memory at 944 MHz, even though my memory is rated at 1033 MHz.
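For anyone new to these numbers: core clock is FSB times the CPU multiplier, so both setups land at about 3.4 GHz and differ mainly in how hard the memory is pushed. A quick Python check using only the figures in this post (the helper name core_clock_mhz is made up for the example):

```python
def core_clock_mhz(fsb_mhz, multiplier):
    """Core clock is the front-side bus clock times the CPU multiplier."""
    return fsb_mhz * multiplier


rated_memory_mhz = 1033

# (FSB in MHz, CPU multiplier, memory speed in MHz), as quoted in the post
setups = {
    "8x at 425": (425, 8, 850),
    "9x at 377": (377, 9, 944),
}

for name, (fsb, mult, mem) in setups.items():
    core = core_clock_mhz(fsb, mult)
    print(f"{name}: core {core} MHz, memory {mem} MHz "
          f"({mem / fsb:.2f}x FSB, {mem / rated_memory_mhz:.0%} of rated speed)")
```

The core clocks come out at 3400 and 3393 MHz, so the stability difference is not about CPU speed; the second setup simply runs the memory much closer to its rated limit.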

DOSGuy, you might actually want to look at running two instances of the client, one on CPU1 and the other on CPU2, and then running two instances of the sieve client. It's another aspect of the project that you might want to look at; currently we still need sievers. And sieve really rocks on the quad cores.

I know this is not an overclocking forum, but I'd strongly suggest that everyone with a q6600 download a couple programs.

Coretemp.exe, to measure your processor temperature. Make sure you're in the mid 50s or less at full load. If you're not.... you're not stable.... period.... ok lots of periods........

Second, if you have a P35 board and a Q6600 with decent memory, try running the following.

set fsb to 400
set multiplier to 7x
set memory multiplier to 2x
up your voltage to 1.35V

If everything is good and your temps are low (less than 55C) under full load, then test, test, test, test with Prime95 for stability. The result is at least a 17% increase in performance.
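Roughly where the 17% comes from: 400 x 7 gives a 2.8 GHz core clock, against the Q6600's stock 2.4 GHz. Here is a small Python sketch of that arithmetic. The stock settings (266.67 MHz FSB, 9x multiplier) are not stated in this thread and are filled in as an assumption from the Q6600's published specification; the variable names are just for illustration.

```python
# Assumed stock Q6600 settings (not given in the thread)
stock_fsb_mhz = 266.67
stock_multiplier = 9

# Settings suggested in the post
oc_fsb_mhz = 400
oc_multiplier = 7
memory_multiplier = 2

stock_core = stock_fsb_mhz * stock_multiplier
oc_core = oc_fsb_mhz * oc_multiplier

core_gain = oc_core / stock_core - 1        # increase in core clock
fsb_gain = oc_fsb_mhz / stock_fsb_mhz - 1   # increase in bus clock
memory_mhz = oc_fsb_mhz * memory_multiplier

print(f"core clock: {stock_core:.0f} MHz -> {oc_core} MHz ({core_gain:.1%})")
print(f"FSB:        {stock_fsb_mhz} MHz -> {oc_fsb_mhz} MHz ({fsb_gain:.1%})")
print(f"memory:     {memory_mhz} MHz")
```

The core clock rises by a little under 17%, while the bus runs about 50% faster and the memory sits at 800 MHz, which matches the DDR2-800 mentioned earlier. Actual performance gains depend on the workload.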

Joh14vers6

01-18-2008, 06:36 AM

vjs wrote: "DOSGuy, [...] But trust me, people: it appears very stable but it's not. I had to back mine down to 3.4 GHz, and even now I'm not certain."

vjs wrote: "If everything is good and your temps are low (less than 55C) under full load, then test, test, test, test with Prime95 for stability. The result is at least a 17% increase in performance."

That is exactly how my Q6600 is running nowadays.

DOSGuy

01-18-2008, 10:06 AM

I'm cooling with a Scythe Infinity with a 120mm SilenX fan (72 CFM @ 14 dBA) and using a very conservative overclock to keep the CPU temperature in the 40 to 45C range because I'm concerned about "hot spots". When the average temperature of the CPU is 50C, there can be places on one of the four cores that are considerably hotter.

I tested the heck out of it with Prime95 to be sure that my results would be reliable, but I'm having to slowly reduce the overclock a few MHz at a time. It seems to be really important to check the heatsink regularly to remove any dust buildup, and it might be a good idea to reapply thermal grease once a year or so. I don't know if it's really necessary (especially if you're using quality grease like Arctic Silver), but I know that the thermal grease can actually become an insulator if it bakes on.

vjs

01-18-2008, 10:00 PM

DosGuy,

Yup, that's why you should look at coretemp.exe; it will give temperatures for the individual cores.

I'm running the thermaltake U-120 extreme with two 120mm fans in a push-pull configuration. Case temperature is 26C. I don't know the CFM, but I'm sure it's over 100.

I lapped the processor but not the heatsink. As for thermal compound, I tried everything: Arctic ceramic, Antec silver; the best one was the Thermaltake stuff. You should not have to reapply heatsink compound unless you're rocking the heatsink off the CPU or doing something weird to crack the interface (I guess baking on would cause air pockets...).

Most thermal compound gets better with time, not worse. :confused: I guess I always take the heatsink off before a year is up. Interesting.

DOSGuy

01-19-2008, 09:49 AM

vjs wrote: "Most thermal compound gets better with time, not worse."

That's good to know. As a former computer technician, I've pulled a lot of heat sinks off of CPUs and found baked-on thermal compound. I use quality components and upgrade every year or two, so I've never observed it in my own systems, but I've read how important it is to thoroughly clean the CPU surface before putting a new heatsink on it, because any leftover compound could bake and become an insulator.

Cleaning the dust out of the heatsink is definitely good advice, and probably explains why CPUs tend to get hotter as time goes on.

vjs

01-19-2008, 10:20 AM

I guess with that old white paste, I can see that stuff going bad with time and cracking.

The newer silver suspension or alumina compounds actually get better with time. The fine silver particles conduct heat better than the carrier oil. It is my understanding that the oil wicks out in a couple of days and temps drop by 2-3C at full load.

I also made a mistake on the heatsink: it's actually a Thermalright U-120 Extreme... These things are awesome. It dropped my friend's temp by 4C over the Zalman, and I dropped 7C over the Rosewill. I think I'm at 1.400V, at 54C now on my hottest processor and 49C on the coldest. It's really dependent on room temperature, unfortunately.

hhh

01-20-2008, 05:31 AM

vjs wrote: "DOSGuy, you might actually want to look at running two instances of the client, one on CPU1 and the other on CPU2, and then running two instances of the sieve client. It's another aspect of the project that you might want to look at; currently we still need sievers. And sieve really rocks on the quad cores."

And why not run 4 sieving instances directly, or even more, if that helps? It's so much more efficient, project-wise...

H.

vjs

01-21-2008, 08:19 PM

Running one instance of sieve per core is the most effective.

riptide

04-21-2008, 03:38 AM

Just scanning through this thread....

First of all, this is the service line you need: sobsvc -p:4:0

That will peg 4 clients, each to one core. You don't want any more clients, as that will split the processes across the native dual cores (Intel 65nm quad = 2x native dual cores), and this may impact cache, etc.
One client per core on these quads. I've been running a few for quite a while now, ever since the QX6700 came out.

Secondly, memory bandwidth and, to a lesser extent, latency mean everything to performance. Upping the memory by a divider, if it's up to the task, can be nearly as good as having that invisible 5th core on a quad.

Ever notice, on a dual or a quad with 2 or 4 clients open, what happens when you stop one client? The others start to race faster. Instinct tells me that's because all the bandwidth has become available to the remaining clients. It also tells me that there wasn't enough bandwidth to begin with. Not even for the dual cores with only 2 clients!! Never mind the 4 cores. So I'll say it again: memory speed is king!
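A toy model of that reasoning, in Python. All of the numbers are invented purely for illustration (they are not measurements from any system), and the even split of bandwidth is a simplifying assumption: each client could run at some CPU-bound rate, but the clients share a fixed memory bandwidth budget, so each one gets the smaller of the two.

```python
def per_client_rate(clients, cpu_bound_rate, shared_bandwidth):
    """Rate of each client when a fixed bandwidth budget is split evenly.

    Hypothetical model: a client runs at its CPU-bound rate unless its
    share of the memory bandwidth is smaller than that.
    """
    return min(cpu_bound_rate, shared_bandwidth / clients)


# Made-up numbers in arbitrary units, for illustration only
cpu_bound_rate = 4.0
shared_bandwidth = 12.0

for n in (4, 3, 2, 1):
    rate = per_client_rate(n, cpu_bound_rate, shared_bandwidth)
    print(f"{n} client(s): {rate:.1f} per client, {n * rate:.1f} total")
```

In this model, going from 4 clients to 3 lets each remaining client speed up, which is the behaviour described above. Faster memory corresponds to a larger shared_bandwidth, which raises the per-client rate without touching the CPU clock.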

CPU speed is a given. More GHz, more speed. But here's a bit of advice: if you have to risk 100-200 MHz on the CPU to get that memory divider up to 1066 or more... do it! You'll more than make up for the loss.

I've said enough. :jabber::)

Neil

plonk420

06-14-2008, 11:48 PM

I used sobsvc -p:4:0 and I'm sitting happily at 100% CPU on a Phenom. I can't wait to see if the Phenom actually does well at something other than just video (x264) encoding ;)

riptide

07-17-2008, 12:37 PM

How did the phenom get on?

PCZ

07-18-2008, 08:22 AM

Have to agree with riptide here.
Memory bandwidth is the key to good performance.
Trying to run more than one client per core is futile.
Lower the multiplier and up the FSB if you want more speed.