Dual Processor/64 Bit Support?



ZorbaTHut
12-23-2005, 01:41 PM
I'm having a lot of trouble tracking this down - I've got SoB running on a dualcore 64-bit system, and it's using, strangely, 63% of my computer. So it's taking *some* advantage of the dualprocessor, but not full advantage. Any suggestions on how to make it use 100%? Can I just run two clients?

Also, are there any plans to make a 64-bit version? I doubt the 64-bitness alone would help much, if any, but the added registers available on 64-bit chips might be significant. (Or might not.)

Matt
12-23-2005, 03:22 PM
You can use the service to get dual processor support. This was posted by vjs in another post:



C:\progra~1\sb

then

sobsvc -i <-- installs client

then

sobsvc -o2 <--- for multi-processor

then

sobsvc -m <--- keeps sb client working

ZorbaTHut
12-23-2005, 04:00 PM
Just tried that - wow, that worked spectacularly badly. I guess each client was instacrashing, and it would start another, every second or so. Stopped the service - the sobsvc.log file is full of

[2005/12/23 12:57:23.656]: Retrying...
[2005/12/23 12:57:24.156]: starting client 1
[2005/12/23 12:57:24.234]: Unable to start client, Code = 4

whereas sb.log is full of simply

[Fri Dec 23 12:57:35 2005] got k and n from cache

Service is killed and I'm back to running it normally. Any idea on how to fix this?

It's installed in a location it might not expect - c:\Program Files (x86)\SB - because I'm on winxp64. If it's got the "default installation path" hardcoded, that could cause this.

psnow
12-25-2005, 02:36 PM
Originally posted by ZorbaTHut:
It's installed in a location it might not expect - c:\Program Files (x86)\SB - because I'm on winxp64. If it's got the "default installation path" hardcoded, that could cause this.

Check the following with regedit:
Under the key HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\LhDn\sob\config there is a value named "dir".
If the value of "dir" is C:\Program Files\SB, then change it to
C:\Program Files (x86)\SB

--
psnow

ZorbaTHut
12-26-2005, 01:44 AM
Originally posted by psnow:
if the value for "dir" is C:\Program Files\SB, then change it to
C:\Program Files (x86)\SB

It's already set to that :/ I guess that's not the problem. I still get the issue with the service though.

Ken_g6[TA]
12-27-2005, 02:48 PM
For a 64-bit version of SoB, first the big integer library will have to get updated by George Woltman. More info here (http://www.mersenneforum.org/showthread.php?t=3924), though I don't see any progress since March.

vjs
12-27-2005, 04:36 PM
Perhaps it's a problem with spaces and ( characters?

Try an uninstall and reinstall, and use a directory such as c:\sb ...

Also, is it an AMD or a Pentium chip?

A dual-core HT chip should run 3 instances, and dual-core Athlons two instances of course. Not sure how you got 63%?

Computing on 2 of 3 processors?

You might also want to check what's happening in Task Manager (Ctrl-Alt-Del) and see if the client is skipping from processor to processor.

If you read the readme.txt, you can specify which client to run on which processor...

-----------------------
Suggestion: do the reinstall, then try the following switch.

I'm going to assume that you have a dual-core Pentium with HT: two physical processors and two virtual.

If so, use the -p switch. I can't remember the exact switch 100%, but I'm pretty sure it's:

-p:3:2

Three clients: the first 2 will be locked one to each of the two physical processors, and the third will float on the HTs. (I'm pretty sure this is correct.)

Try the above and let us know.

psnow
12-28-2005, 10:14 AM
Originally posted by vjs:
Perhaps it's a problem with spaces and ( characters ???



I have SB working fine on "Windows XP Pro x64 edition", so I don't think the problem is the spaces or parentheses in the path.
I am running an "AMD Athlon 64 X2 Dual Core". I run SB as a service, but currently only one instance.
Like vjs recommended, I also suggest you try a new install first.

--
psnow

ZorbaTHut
12-29-2005, 03:20 AM
No luck.

Tried reinstalling, removing the service, reinstalling the service, etc. Whenever it tries to run the SB client from the service it instacrashes. sb.log says:

[Thu Dec 29 00:01:39 2005] got k and n from cache

. . . and nothing else, for each try. sobsvc.log says:

[2005/12/29 00:01:39.222]: Starting Service
[2005/12/29 00:01:39.222]: starting client 1
[2005/12/29 00:01:39.316]: Unable to start client, Code = 4
[2005/12/29 00:01:42.816]: Client Stop initiated...
[2005/12/29 00:01:42.816]: Client Stop completed
[2005/12/29 00:01:42.816]: Service Stopped

. . . and nothing else, for each try.
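For anyone who wants to see how often the service is failing and with which code, here is a minimal Python sketch that tallies the failure lines. It assumes only the line format visible in the sobsvc.log excerpts in this thread; the function name count_failures is made up for illustration, and nothing here explains what Code = 4 actually means.

```python
import re
from collections import Counter

# Matches the failure line format seen in the sobsvc.log excerpts in this thread.
FAILURE = re.compile(
    r"^\[(?P<stamp>[^\]]+)\]: Unable to start client, Code = (?P<code>\d+)"
)


def count_failures(lines):
    """Return a Counter mapping error code -> number of failed client starts."""
    codes = Counter()
    for line in lines:
        match = FAILURE.match(line.strip())
        if match:
            codes[int(match.group("code"))] += 1
    return codes


sample = [
    "[2005/12/29 00:01:39.222]: Starting Service",
    "[2005/12/29 00:01:39.222]: starting client 1",
    "[2005/12/29 00:01:39.316]: Unable to start client, Code = 4",
    "[2005/12/29 00:01:42.816]: Client Stop initiated...",
]
print(count_failures(sample))  # Counter({4: 1})
```

To run it against a real log, pass the open file object instead of the sample list, since iterating a file yields its lines.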

Turning on -m does, indeed, make it restart crashed clients - in fact it'll keep restarting them, several times a second, until I give up and tell it to stop (by turning the service off, of course.) The other settings don't seem to have anything to do with this (well, if I tell it to run more clients than 1, it does try to, and they all crash. Running one client doesn't improve matters.)

I've got it installed at c:\data\sb for now, and that doesn't seem to have changed anything. I've tried uninstalling it, I've tried reinstalling it, I've dug out where it is in the registry and tried clearing that (and got it into a state where it wouldn't run - there's stuff in the registry that the installer creates that it can't self-create - but a quick reinstall fixed that.)

And yet, nothing I do can keep it running if it's started from the service. I can't even get the error it crashes with - I presume there's *some* error, I just don't know what.

Oh, the 63% thing - a single sb.exe process is often using above 50%. I haven't seen that number again, it seems to hover around 54% now. There's no less than 4 threads in the client, so I presume that between them they're using slightly more than a single CPU.
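The arithmetic behind that reading can be sketched in a couple of lines of Python. This assumes Task Manager reports a process's usage as a share of all logical CPUs combined, so on a two-CPU machine one fully busy thread shows as 50%. The function name is hypothetical.

```python
def cores_in_use(task_manager_percent, logical_cpus):
    """Convert a whole-machine CPU percentage into 'CPUs worth' of work."""
    return task_manager_percent / 100.0 * logical_cpus


# On a 2-CPU machine, 54% is a little more than one full CPU.
print(round(cores_in_use(54, 2), 2))  # 1.08
print(round(cores_in_use(63, 2), 2))  # 1.26
```

Anything above 1.0 on this scale means more than one thread was doing work at the same time.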

Also, I've got an Athlon 64 X2 like psnow's. However, the sb.exe service just plain doesn't seem to work. At all. I can run the client manually, but the service? Instant failure.

vjs
12-29-2005, 01:05 PM
That's extremely odd. It sounds like you have admin rights and the registry isn't locked by any program or privilege, since you can access it etc...

I don't know of any fixes for this and I've never seen this before.
What you might try, however, is breaking the %CPU display into two graphs, one for each CPU, and seeing which processor is being tasked.

I take it that you have read the readme.txt with no success.

What I'm thinking is that you could try specifying one instance on the second CPU and see if you can move the process with the switches.

Under the ldnh??? or whatever reg-key it is there should be a 0 and 1 or 1 and 2 (can't remember). Is this being created?

The easy out here is to simply run one instance of the sb.exe client and then run prime95 factoring or sieve. I know it's not a fix but those two sub-projects could always use the extra help.

--------------------------

As for the sb.exe client taking more than 50%, I don't think that's possible. I'm assuming that you're reading the total consumed CPU? If not, and the sb.exe task is actually consuming more than 50%???? That's got me, might want to check for spyware or something...

ZorbaTHut
12-29-2005, 02:38 PM
Originally posted by vjs:
That's extremely odd. It sounds like you have admin rights and the registry isn't locked by any program or privilege, since you can access it etc...

I don't know of any fixes for this and I've never seen this before.
What you might try, however, is breaking the %CPU display into two graphs, one for each CPU, and seeing which processor is being tasked.

They both are, pretty equally - the process is obviously bouncing between them. (Good ol' Windows.)


Originally posted by vjs:
I take it that you have read the readme.txt with no success.

What I'm thinking is that you could try specifying one instance on the second CPU and see if you can move the process with the switches.

I can easily move it back and forth by changing the affinity in the task manager. Since starting the client via the services doesn't work, I can't make the changes there and see what happens.
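For reference, Windows expresses processor affinity as a bitmask in which bit N stands for CPU N. Here is a small Python sketch of that encoding. The helper names are made up for illustration, and it says nothing about how sobsvc sets affinity internally.

```python
def affinity_mask(cpus):
    """Build an affinity bitmask from a list of zero-based CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def cpus_in_mask(mask):
    """List the zero-based CPU numbers whose bits are set in the mask."""
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


print(hex(affinity_mask([0])))     # 0x1  (first CPU only)
print(hex(affinity_mask([1])))     # 0x2  (second CPU only)
print(hex(affinity_mask([0, 1])))  # 0x3  (either CPU, so the process can bounce)
```

A process allowed on both CPUs (mask 0x3) is free to bounce between them, which matches what the two-graph view shows here.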


Originally posted by vjs:
Under the ldnh??? or whatever reg-key it is there should be a 0 and 1 or 1 and 2 (can't remember). Is this being created?

I've got LhDn/sob/cache and LhDn/sob/cache2, although curiously only cache2 has any keys in it (it's doing what my current single manually-run client is doing.)


Originally posted by vjs:
The easy out here is to simply run one instance of the sb.exe client and then run prime95 factoring or sieve. I know it's not a fix but those two sub-projects could always use the extra help.

Factoring and sieving both seem to require far more user intervention than I want. Really, I'd rather get this working, but if this won't work I'll just leave a CPU idle :P Laziness trumps.


Originally posted by vjs:
As for the sb.exe client taking more than 50%, I don't think that's possible. I'm assuming that you're reading the total consumed CPU? If not, and the sb.exe task is actually consuming more than 50%???? That's got me, might want to check for spyware or something...

http://img426.imageshack.us/img426/4483/53percent5lg.gif

It's possible, the sb.exe client is multithreaded. I presume one of them is the GUI or some other maintenance process (could be garbage collection for all I know, I don't know how it's coded.) I don't have any convenient way to look at the process on a per-thread basis, but I bet I'd see one thread running at 50% and one occasionally giving small spikes of 2% to 5%.

Does anyone know if there's a way to increase the verbosity of the logs? It's quitting for a reason, and if I could figure out what that reason was, I might be able to fix it. A while back I turned off a lot of non-essential services on this computer - it might be trying to use one of those (although if it is, it should have a dependency listed in the service manager), for example.

vjs
12-29-2005, 04:42 PM
Originally posted by ZorbaTHut:
They both are, pretty equally - the process is obviously bouncing between them. (Good ol' Windows.)

This should be solved by using the following switch

sobsvc -p:2:0

(assigns two instances one to each processor)

I also wouldn't mess with the affinity.



Originally posted by ZorbaTHut:
I've got LhDn/sob/cache and LhDn/sob/cache2, although curiously only cache2 has any keys in it (it's doing what my current single manually-run client is doing.)

That's what I was thinking. I'm wondering if the following is the issue
(shooting in the dark here).

Since cache1 has no keys and you're only running one instance, perhaps it keeps looking for the k/n values in cache1 and can't find any? Generally when you start one instance it only uses cache1.

This could also explain the 53% issue. I know that some time back the client would stop processing while it was transmitting results, which was a loss in production during server communication. This portion was made multi-threaded so that it could continue to process while it was talking...

So the client is consuming 50% and the communications thread is consuming 3%. This makes sense, since it's continuously talking and trying to download a k/n pair which it can never write into the registry, since there are no keys or ??"key folders"??

Worth a shot: create identical registry entries in cache1 and cache2.

run

sobsvc -p:2:0

Also, I think there was an issue with running multiple switches at the same time. For example, if you're going to use the switches -m, -x etc.

Do
sobsvc -p:2:0
sobsvc -m
sobsvc -x

not

sobsvc -p:2:0 -m -x

Give that a try and let us know.

If it works let it process the duplicate, then see what it downloads.

Let us know,

Does anyone know what code 4 is? I bet it's some registry fault with the client, i.e. k/n pair not found in the registry.

I'll cross my fingers.

ZorbaTHut
12-30-2005, 01:44 AM
fiddling with sobsvc

No workie. Identical behavior.

I'm seriously thinking the only way this'll be fixed is to get some better debug output from the client. ;)

(edit) Oh, and fiddling with the registry did nothing either :/ I ended up deleting all the stuff I'd created just to avoid mucking with it further. However, my progress does show up on the stats website . . . albeit really, really strangely. See http://www.seventeenorbust.com/stats/users/user.mhtml?userID=10499 - I don't understand why it's gradually slowing (the first section is when I *did* have dual processor working - I'd just run two manually, and I suspect they'd both gone and gotten work units. Now when I run a second one, it just picks up the same work as the first one is doing) and I don't understand what that giant 70M peak is. Or any of the other peaks or weird behavior. Very, very confused, honestly.

vjs
12-30-2005, 01:50 PM
I don't know what to tell you. I've tried searching the forums for the code = 4 and can't find anything. I'm sure louie or dave know. Why don't you PM alien88 and just ask him what code 4 means?

I guess your best bet is to simply use one client for now until someone figures it out. In the meantime, try the sieve or p-1 client. It's a little more manual but it's really not that bad at all. It's also addictive... a lot of people at TPR are using these clients, and have pushed ARS into 1st/2nd in that section. It's basically a once a month deal: reserve a range, leave it alone, come back later etc etc.

Ken_g6[TA]
12-30-2005, 03:41 PM
SoB gets its K and N values from a key in the registry designated by a value in the registry. The value that designates the key is "HKEY_LOCAL_MACHINE\SOFTWARE\LhDn\sob\config\ClientKey". If you run two SB instances, and the second one sees that one has been running, it will pick up the same K and N. The way the service is supposed to work is that it starts one instance, changes that value, starts another instance, and changes the value back! :looney:

You could try doing that procedure manually and see if (1) it works, or (2) you can see what fails.
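The manual procedure can be sketched in Python as a simulation. This is only an illustration of the sequence described above: the registry is a plain dict rather than the real Windows registry, launch is a hypothetical stand-in for starting sb.exe, and the two cache key names are the ones reported elsewhere in this thread. It also assumes each client reads ClientKey before the value is changed again, which a real procedure would have to wait for.

```python
# Value name quoted in the post above; cache key names as reported in this thread.
CLIENT_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\LhDn\sob\config\ClientKey"
FIRST_CACHE = r"Software\LhDn\sob\cache"
SECOND_CACHE = r"Software\LhDn\sob\cache2"


def start_two_clients(registry, launch):
    """Start two clients, each seeing a different ClientKey value.

    registry: a plain dict standing in for the Windows registry (assumption).
    launch:   a hypothetical callable that starts one client, which reads
              ClientKey before launch() returns.
    """
    original = registry[CLIENT_KEY]
    try:
        registry[CLIENT_KEY] = FIRST_CACHE
        launch(registry)                  # client 1 picks up the first cache key
        registry[CLIENT_KEY] = SECOND_CACHE
        launch(registry)                  # client 2 picks up the second cache key
    finally:
        registry[CLIENT_KEY] = original   # change the value back, as described


seen = []
fake_registry = {CLIENT_KEY: FIRST_CACHE}
start_two_clients(fake_registry, lambda reg: seen.append(reg[CLIENT_KEY]))
for value in seen:
    print(value)
```

If both printed values were the same, the two clients would be reading the same K and N, which is the duplicate-work symptom described earlier in the thread.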

Matt
12-30-2005, 08:57 PM
Originally posted by vjs:
I guess your best bet is to simply use one client for now until someone figures it out. In the meantime, try the sieve or p-1 client. It's a little more manual but it's really not that bad at all. It's also addictive... a lot of people at TPR are using these clients, and have pushed ARS into 1st/2nd in that section. It's basically a once a month deal: reserve a range, leave it alone, come back later etc etc.

I can agree with this, I was sceptical about doing sieving at first, now I'm seriously addicted to it and have converted most of my machines to sieving!

ZorbaTHut
12-31-2005, 12:13 AM
Originally posted by Ken_g6[TA]:
SoB gets its K and N values from a key in the registry designated by a value in the registry. The value that designates the key is "HKEY_LOCAL_MACHINE\SOFTWARE\LhDn\sob\config\ClientKey". If you run two SB instances, and the second one sees that one has been running, it will pick up the same K and N. The way the service is supposed to work is that it starts one instance, changes that value, starts another instance, and changes the value back! :looney:

. . . Please tell me you're joking.

. . . you're not joking.

Augh.

That's one of the worst designs I've seen in a very, very, very, very long time. Why can't it at least be a command-line parameter? Seriously! Oy!

Well, it works if I do it manually. Clearly starting the client from the service is the only part that's broken. Since this machine rarely gets turned off or rebooted, I'll just let it run like this, I suppose.

Yeesh. Well, thanks for the info, at least it works now :P

psnow
12-31-2005, 07:33 AM
Originally posted by Ken_g6[TA]:
The value that designates the key is "HKEY_LOCAL_MACHINE\SOFTWARE\LhDn\sob\config\ClientKey".

In WinXP x64 the key is actually in:
"HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\LhDn\sob\config\ClientKey"

Under this "Wow6432Node" you can find all 32-bit programs.
But that should be no problem, as Windows should handle this automagically.
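As a rough illustration of that mapping, here is a Python sketch that rewrites a 32-bit program's HKEY_LOCAL_MACHINE\SOFTWARE path to the place regedit shows it on 64-bit Windows. It is a simplification: Windows performs this redirection itself for 32-bit processes, and some subkeys are shared rather than redirected. The function name wow64_view is made up for illustration.

```python
PREFIX = "HKEY_LOCAL_MACHINE\\SOFTWARE\\"
REDIRECTED = PREFIX + "Wow6432Node\\"


def wow64_view(key_path):
    """Return where a 32-bit program's HKLM SOFTWARE key appears in 64-bit regedit."""
    upper = key_path.upper()
    if upper.startswith(REDIRECTED.upper()):
        return key_path                       # already the redirected path
    if upper.startswith(PREFIX.upper()):
        return REDIRECTED + key_path[len(PREFIX):]
    return key_path                           # not under HKLM SOFTWARE: left alone


print(wow64_view(r"HKEY_LOCAL_MACHINE\SOFTWARE\LhDn\sob\config\ClientKey"))
```

So when following registry instructions written for 32-bit Windows, look under Wow6432Node in regedit on x64.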

As I have a similar configuration, I also tried running two instances, and they seem to work fine (CPU usage 100%).
Before I started the service the ClientKey had value: "Software\LhDn\sob\cache"
and afterwards: "Software\LhDn\sob\cache2"

My sobsvc parameters are:

C:\Program Files (x86)\SB>sobsvc -d
CPU Affinity = Separate, 2 Clients
AutoRestart OFF
TrueIdle = TRUE
NormalizeXP = 40da34
Monitor and Restart = FALSE
Restart Stuck Clients = FALSE
Keep Icons Visible = TRUE
WU Queue Size = 0
Autodial Type = 0
Periodic Restart = 0

from my sobsvc.log:

[2005/12/31 13:54:19.328]: Parms retrieved: NumClients = 2, AffType = 0,
WUQueue = 0, PeriodicRestart = 0, AutodialType = 0
[2005/12/31 13:54:19.328]: TrueIdle = TRUE, MonitorRestart = FALSE, StuckRestart = FALSE, KeepVisible = TRUE
[2005/12/31 13:54:19.328]: NormalizeXP = FALSE
[2005/12/31 13:54:19.328]: AutoRestart OFF
[2005/12/31 13:54:19.328]: ** SB Client version 1.10+ **
[2005/12/31 13:54:19.328]: Starting Service
[2005/12/31 13:54:19.328]: starting client 1
[2005/12/31 13:54:19.734]: Client start successful: 0xb0 (0x1118) - 0x1b0ef2 (0x90ff0)
[2005/12/31 13:54:19.734]: starting client 2
[2005/12/31 13:54:20.093]: Client start successful: 0xb8 (0x1370) - 0x90f6e (0x80f58)

and sb.log:

[Sat Dec 31 13:54:19 2005] connecting to server
[Sat Dec 31 13:54:19 2005] connecting to server
[Sat Dec 31 13:54:20 2005] logging into server
[Sat Dec 31 13:54:21 2005] logging into server
[Sat Dec 31 13:54:21 2005] requesting a block
[Sat Dec 31 13:54:21 2005] requesting a block
[Sat Dec 31 13:54:21 2005] got proth test from server (k=19249, n=6049778)
[Sat Dec 31 13:54:21 2005] AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ detected. Enabling cpu specific optimizations.
[Sat Dec 31 13:54:21 2005] got proth test from server (k=55459, n=6049786)
[Sat Dec 31 13:54:21 2005] AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ detected. Enabling cpu specific optimizations.
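A quick way to confirm that two clients really received different work is to pull the k/n pairs out of sb.log and look for repeats. This is a minimal Python sketch assuming only the "got proth test from server" line format shown in the excerpt above; the function names are made up for illustration.

```python
import re

# Matches the assignment line format seen in the sb.log excerpt above.
PROTH = re.compile(r"got proth test from server \(k=(\d+), n=(\d+)\)")


def assigned_tests(lines):
    """Return the (k, n) pairs handed out by the server, in log order."""
    pairs = []
    for line in lines:
        match = PROTH.search(line)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


def duplicated(pairs):
    """Return the set of (k, n) pairs that appear more than once."""
    seen, dupes = set(), set()
    for pair in pairs:
        if pair in seen:
            dupes.add(pair)
        seen.add(pair)
    return dupes


sample = [
    "[Sat Dec 31 13:54:21 2005] requesting a block",
    "[Sat Dec 31 13:54:21 2005] got proth test from server (k=19249, n=6049778)",
    "[Sat Dec 31 13:54:21 2005] got proth test from server (k=55459, n=6049786)",
]
pairs = assigned_tests(sample)
print(pairs)              # [(19249, 6049778), (55459, 6049786)]
print(duplicated(pairs))  # set()
```

An empty result from duplicated means each client got its own test, as in this log.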


So I was not able to reproduce ZorbaTHut's problem.

--
psnow