DOSGuy
01-17-2008, 11:35 AM
Dual core has been around for a few years, and it looks like quad core could become standard in the near future. It seems like a good time to talk about how to take advantage of this unprecedented processing power.
I have a Q6600, so I figured that I must have enough spare CPU resources left between the four cores to run a fifth client. When I was running one client on each core, I was getting 3M per client, so that's my baseline. I then ran:
sobsvc -p:5:3:0:0:0:0:1
Just to recap, this tells SB to run 5 clients, and the "3" says that I'm going to select the affinity for each client. The first four clients have affinity "0", meaning they get their own core, and the fifth client is affinity "1", meaning that it's allowed to run on all of the CPUs.
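For anyone who finds the parameter string hard to read, here is a rough Python sketch that decodes it the way I described above. This is not part of SB: the function names are made up for illustration, and it only knows the meanings given in this post (mode 3 = per-client list, 0 = own core, 1 = any CPU). Anything else is reported as undescribed rather than guessed.

```python
# Hypothetical helper, not part of the SB client. It only encodes the
# meanings described in this post.
AFFINITY_MEANING = {
    0: "gets its own core",
    1: "may run on any CPU",
}

def parse_sob_params(arg):
    """Return one affinity value per client for a string like '-p:5:3:0:0:0:0:1'."""
    fields = arg.split(":")
    if len(fields) < 3 or fields[0] != "-p":
        raise ValueError("expected something like -p:<clients>:<affinity>[:...]")
    clients = int(fields[1])
    mode = int(fields[2])
    if mode == 3:
        # Mode 3: an explicit affinity value follows for every client.
        per_client = [int(value) for value in fields[3:]]
        if len(per_client) != clients:
            raise ValueError("need exactly one affinity value per client")
        return per_client
    # Any other mode applies the same affinity to every client.
    return [mode] * clients

def describe(arg):
    """Return one human-readable line per client."""
    lines = []
    for number, affinity in enumerate(parse_sob_params(arg), start=1):
        meaning = AFFINITY_MEANING.get(affinity, "not described in this post")
        lines.append(f"client {number}: affinity {affinity} ({meaning})")
    return lines

if __name__ == "__main__":
    for line in describe("-p:5:3:0:0:0:0:1"):
        print(line)
```

Running it on the command above lists four clients with their own core and a fifth that may run on any CPU.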
My hope was that it would use any spare power that any of the four CPUs had left, but the built-in problem is that it can only run on one CPU at a time, so even if all four CPUs have some resources left, only one CPU will actually get to run the fifth client. In theory, it should ensure that the fifth client always gets some time, but it would require 8 clients to truly use all of the spare resources of each core, and if I were going to run 8 clients, why not just use affinity 1 or 2?
The net result was that the first four clients continued to run at the same speed, and the fifth client ran between 500K and 800K. Basically, there was practically no improvement over running four clients.
Still convinced that four cores must have enough spare power to get decent performance across five clients, I tried:
sobsvc -p:5:1
This means that all five clients can use whichever CPU happens to be available. This seems like it would always be the best solution for any number of clients, because they'll each grab whatever CPU time is available. The result was kind of weird.
My first four clients lost an average of 350K each (1.4M in total), but my fifth client jumped to 2M, about 1.5M more than the roughly 500K it was getting at the low end before. So there was a gain of about 100K over the 0:0:0:0:1 method, for a total gain of about 600K over just having four clients.
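To make the arithmetic easy to check, here is a small Python sketch that adds up the three setups using only the figures quoted in this post. The variable names are mine, and the units are whatever the client reports (I just call them K and M).

```python
# All figures come from the post; nothing here is a new measurement.
BASELINE_PER_CLIENT = 3_000_000      # one client per core
FIFTH_PINNED_LOW = 500_000           # fifth client with 0:0:0:0:1, low end
FIFTH_PINNED_HIGH = 800_000          # fifth client with 0:0:0:0:1, high end
LOSS_PER_CLIENT_FLOATING = 350_000   # average loss of clients 1-4 with -p:5:1
FIFTH_FLOATING = 2_000_000           # fifth client with -p:5:1

four_clients = 4 * BASELINE_PER_CLIENT
pinned_low = four_clients + FIFTH_PINNED_LOW
pinned_high = four_clients + FIFTH_PINNED_HIGH
floating = 4 * (BASELINE_PER_CLIENT - LOSS_PER_CLIENT_FLOATING) + FIFTH_FLOATING

gain_over_four = floating - four_clients
percent_over_four = 100 * gain_over_four / four_clients

print(f"four clients:        {four_clients:,}")
print(f"five, 0:0:0:0:1:     {pinned_low:,} to {pinned_high:,}")
print(f"five, all affinity 1: {floating:,}")
print(f"gain over four clients: {gain_over_four:,} ({percent_over_four:.1f}%)")
print(f"gain over 0:0:0:0:1: {floating - pinned_high:,} to {floating - pinned_low:,}")
```

The total for -p:5:1 comes out at 12.6M against 12M for four clients, which is the 600K (5%) gain. Note that the 100K figure assumes the fifth client was at the low end of its 500K to 800K range; at the high end, the 0:0:0:0:1 setup would actually come out about 200K ahead.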
I'm not really sure why the performance isn't equal across all five clients. It looks like the first four are generally getting a core to themselves, and the fifth one is bouncing around between all of the cores. The fifth client steals a bit of performance from the other four, but all four cores are always running one client, and running a second client 1/4 of the time.
It still seems like the ideal way to maximize each core is to run eight clients, but it looks like there's almost nothing left for the second client on each core to pick up. You might get an extra few hundred K out of each core, but there's still very little benefit from running 5, 6, 7 or 8 clients compared to just running 4.
So, I have nothing exciting to report, but I wanted to share my research. Maybe others will share how they maximize performance on their quad core CPUs, and hopefully bring some new ideas to the table. The possibilities will become more interesting when Nehalem comes out, which will bring back hyperthreading: the ability to run a second thread on each core using any leftover resources that the first thread isn't using. Theoretically, to maximize the potential of hyperthreading, Intel will duplicate some of the CPU resources so that the second thread can be used more often, even when it wants the same resources as the first thread.