Thread: New exponents are crashing the client [FIX: SB v1.1.1]

  1. #1

    New exponents are crashing the client

    I have a few boxes that returned completed results today. But when the client starts work on the new downloaded exponent, the client will always crash.

    I tried removing the service install and the -m, -s switches and tried to start up the client manually, but it still crashed.

    I also tried uninstalling the client and reinstalling it in a different directory, but the problem still persisted. It is able to download a new exponent, but it will always crash when it starts working on the exponent.

    As a last resort, I copied a work in progress from 1 of my other machines and edited the registry accordingly. Restarted the client and the client started crunching away. So the problem seems to be bad WUs being handed out.

    So far I have 4 boxes that have stopped crunching SoB and at least 2 other teammates are experiencing the same problem.

    All my boxes are running on WinXP. They have been running the client for the past month with no problems until today.

    Is anyone else experiencing the same problem? Does anybody know what is going on?
    Last edited by kenlow; 10-30-2003 at 12:03 AM.

  2. #2

  3. #3
    Senior Member eatmadustch's Avatar
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    154
    after what n do you get this?
    EatMaDust


    Stop Microsoft turning into Big Brother!
    http://www.againsttcpa.com

  4. #4
    Originally posted by eatmadustch
    after what n do you get this?
    They are all in the 498xxxx range.

  5. #5
    Same problem. Running XP on a 2.4 P4

  6. #6
    Moderator ceselb's Avatar
    Join Date
    Jun 2002
    Location
    Linkoping, Sweden
    Posts
    224
    Please state your CPU and OS, that might help to locate the problem.

    I'm seeing a few of these reports in the ars technica forum as well.

  7. #7
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    This is exactly the same problem we've had with the P-1 factoring client. Sorry to say we didn't have a solution. Just from 498xxxx onwards it works on some machines, but not on others.

    As ceselb suggests, if everyone with this problem lists their PC spec this should give Louie some clues.

    I'll drop Louie an e-mail in a few minutes.

  8. #8
    Hater of webboards
    Join Date
    Feb 2003
    Location
    København, Denmark
    Posts
    205
    This is around the same point as Nuri found to be an upper bound (4980670) for p-1 factoring on his P4 running windows (some version).

    Yet another reason to drop windows?

  9. #9
    Moderator ceselb's Avatar
    Join Date
    Jun 2002
    Location
    Linkoping, Sweden
    Posts
    224
    Yeah, I was thinking the same thing, but the SoB client uses different code afaik.
    The same code is used by GIMPS, so they *should* have spotted any errors.

  10. #10
    is this just the new version or does this happen in the previous releases too?

  11. #11
    Moderator ceselb's Avatar
    Join Date
    Jun 2002
    Location
    Linkoping, Sweden
    Posts
    224
    I tried the 31337 account, it crashes too.

    SB v1.1 P4 running w2k.

  12. #12
    Hater of webboards
    Join Date
    Feb 2003
    Location
    København, Denmark
    Posts
    205
    Originally posted by ceselb
    I tried the 31337 account, it crashes too.

    SB v1.1 P4 running w2k.
    I just tried on two different machines, it works on both:
    SB v1.02 PentiumMMX running Linux.
    SB v1.02 Pentium III running Linux.

    I have no intentions of completing those 31337 tests, but they don't show up under 'Current pending tests', so I can't expire them. Guess I just have to hope regular prp'ing doesn't catch up with those within the next 10 days. But that would leave me quite :shocked:

  13. #13
    Moderator ceselb's Avatar
    Join Date
    Jun 2002
    Location
    Linkoping, Sweden
    Posts
    224
    Tried v1.1 on an old PIII-700, works fine (seems to hang for a minute, but then starts to work).

    Anybody got a P4 on windows that does work?

  14. #14
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    A suggestion for anyone who can't crunch units right now.

    If you change your username in the client to "secret", you'll be testing numbers that were tested by previous searchers (before SoB), but who haven't provided residues (the proof that the tests have been done).

    These numbers are quite small (n=675000), so they will complete quite quickly, and therefore might be best suited to users with a permanent internet connection. The chance of finding a prime is remote but not impossible, and it won't do your personal stats any good, but it will help the SoB project.....and it's only until the client's fixed.

  15. #15
    yeah, that is not a bad suggestion.

  16. #16
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    good idea Mike.

  17. #17
    This may be sheer coincidence, but 1.5e6/log10(2) = 4982892

    So if there's some limit to windows or to the client at 1.5 million digits, that would show up for n around 498xxxx

    (my 1.0 client on a P4 running linux keeps crashing and hanging, but it has been doing that all along and doesn't seem to need a restart any more often than before:

    Mon Oct 20 - 2 times
    Tue Oct 21 - 3 times
    Wed Oct 22 - 1 time
    Thu Oct 23 - 5 times
    Fri Oct 24 - 2 times
    Sat Oct 25 - 2 times
    Sun Oct 26 - 1 time
    Mon Oct 27 - 5 times
    Tue Oct 28 - 4 times
    Wed Oct 29 - 2 times
    Thu Oct 30 - 2 times so far

    Thanks to sbwrap for the log and the restarts.)
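
The digit-count arithmetic in post #17 can be checked with a short Python sketch. It assumes the "log" in that post is the base-10 logarithm. The function names are made up for illustration, and the (k, n) pairs are ones reported elsewhere in this thread. It only shows where 1.5 million decimal digits falls; it does not confirm that any such limit exists in the client.

```python
import math

LOG10_2 = math.log10(2)

def exponent_for_digits(digits: float) -> int:
    """Largest n with n * log10(2) <= digits."""
    return math.floor(digits / LOG10_2)

def proth_digits(k: int, n: int) -> int:
    """Decimal digit count of k*2^n + 1, computed with logarithms."""
    # k*2^n + 1 is odd, so it is never a power of ten and therefore has
    # the same number of digits as k*2^n.
    return math.floor(n * LOG10_2 + math.log10(k)) + 1

if __name__ == "__main__":
    # Exponent at which 2^n reaches about 1.5 million digits.
    print(exponent_for_digits(1.5e6))
    # One test that completed and two that crashed, as reported in the thread.
    for k, n in [(33661, 4972896), (33661, 4983408), (21181, 5007260)]:
        print(k, n, proth_digits(k, n))
```

Logarithms are used instead of converting the full number to a string, since the numbers involved have over a million digits.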

  18. #18
    Junior Member
    Join Date
    Aug 2003
    Location
    Concord, CA
    Posts
    15
    So the problems seem to occur on Opterons and P4 systems. Does this code have SSE2 optimizations? That seems to be the first link that pops out at me.

    I have Xeons running just fine in Linux at 499+, so some problem with Windows + SSE2, perhaps?

  19. #19
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    The client is heavily SSE2 optimized.
    I wouldn't be surprised if the crash limit of SB and SBfactor were exactly the same. I recall Louie saying that both are "based" on the same code - maybe using the same maths libraries?

  20. #20
    Add another P4/2.6C Windows XP rig with 2x512 Kingston HyperX PC3500 to the list that will not run the SoB client. It had been running just fine until it tried to start tests with n=498xxxx and up.

    I have also tried running the client in compatibility mode using Win2K, Win98, etc. Not the best test scenario for checking whether WinXP is the problem, but it was just a thought I had and tried.

  21. #21
    I think the problem is related to P4 systems.

    Just got 2 new proth tests on 2 of my AMD Win2K rigs and they are running fine.
    got proth test from server (k=33661, n=4997112)
    got proth test from server (k=19249, n=4999082)

    Still can not run on my P4/2.6C WinXP rig

  22. #22
    I've been able to successfully run using the secret account on my P4 / XP Pro system, so it's something to do with the larger Ns...

  23. #23
    Junior Member
    Join Date
    Oct 2003
    Location
    Chicago, IL
    Posts
    1

    add one more

    This is my log before it crashed.

    [Wed Oct 29 17:14:32 2003] got proth test from server (k=33661, n=4983408)
    [Wed Oct 29 17:15:31 2003] got k and n from cache

    I'm running Windows 2000 SP 4 on a Mobile Pentium-4 2.00 GHz.

    The error appears to be a null pointer error. "instruction at blah referenced memory at 0x00000"

    HTH,
    Brian
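
For anyone collecting reports like the one above, here is a minimal Python sketch that pulls the assigned (k, n) pairs out of a client log. The line format is inferred only from the log excerpts pasted in this thread. The function name and regular expression are illustrative and not part of the SB client.

```python
import re

# Matches lines such as:
# [Wed Oct 29 17:14:32 2003] got proth test from server (k=33661, n=4983408)
ASSIGN_RE = re.compile(
    r"^\[(?P<stamp>[^\]]+)\] got proth test from server "
    r"\(k=(?P<k>\d+), n=(?P<n>\d+)\)"
)

def assigned_tests(log_text: str) -> list[tuple[int, int]]:
    """Return every (k, n) pair the log says was handed out by the server."""
    pairs = []
    for line in log_text.splitlines():
        match = ASSIGN_RE.match(line.strip())
        if match:
            pairs.append((int(match["k"]), int(match["n"])))
    return pairs

SAMPLE = """\
[Wed Oct 29 17:14:32 2003] got proth test from server (k=33661, n=4983408)
[Wed Oct 29 17:15:31 2003] got k and n from cache
"""

if __name__ == "__main__":
    print(assigned_tests(SAMPLE))
```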

  24. #24
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    k=22699, n=5001382.

    P4-1700.

    And as expected, crashes.

  25. #25
    Team Anandtech
    Join Date
    Aug 2003
    Location
    New Zealand
    Posts
    50
    Have we determined if the crashing is isolated to NT/2k/XP yet? It seems that it's definitely only SSE2 enabled cpus (ie, the p4 and opteron) that are having trouble.

  26. #26
    As far as I can tell, my athlons seem to be producing fewer cem/s lately than they used to. I cannot really tell whether this is because the FFT length just made another jump (odd coincidence?) or whether there are in fact "hidden crashes" on athlons -- i.e. the current exponents kill athlons as much as P4's, just not as thoroughly, and thus the client has to be restarted by the service and produces less "long-term average" output.

    I just thought I'd mention this to people who have (only) athlons running: is your output constant or have you seen reduction in production (in terms of your actually submitted cem/sec, not in terms of the nonsense the client app displays) in the last couple days?

  27. #27
    Originally posted by allio
    Have we determined if the crashing is isolated to NT/2k/XP yet? It seems that it's definitely only SSE2 enabled cpus (ie, the p4 and opteron) that are having trouble.
    Well my other P4 2.4C/3.0Ghz rig is off SoB, also.

    All of my AMD rigs (5 total) are running just fine and they are
    running Win2K Pro and WinXP Pro. This would indicate that this
    bug is with P4 processors and not the OS.

    All my AMD rigs have completed tests and downloaded new n = 498-500
    range proth tests.

  28. #28
    Moderator Joe O's Avatar
    Join Date
    Jul 2002
    Location
    West Milford, NJ
    Posts
    643
    Originally posted by Lagardo
    As far as I can tell, my athlons seem to be producing fewer cem/s lately than they used to. I cannot really tell whether this is because the FFT length just made another jump (odd coincidence?)
    While I'm not 100% sure, I think that we just crossed or are crossing an FFT boundary. If you look at the exponent boundaries for GIMPS/Prime95 non SSE2 code, the 2 closest are 4598000 & 5255000. The Pentium 4 SSE2 code has slightly different ranges for each FFT size. You also have to allow for the "K" values. They would force us to switch to a larger FFT a little sooner than GIMPS/Prime95.
    We are having similar problems in SB P-1 factoring. If you search through the P-1 Factoring program thread, you will see that Louie fixed a similar problem for SBFactor running under LINUX. That problem was a code alignment problem. The problem was manifested only under Linux not windows, and the fix was only to the Linux not the windows program. Well this time the Linux programs work and the Windows ones don't. If the problem with the PRP client is the same as the SBFactor program, the non SSE2 machines are not affected. At least not yet. I don't know if a similar problem awaits us at the next boundary. We can only wait and see. By the way, the next non SSE2 boundaries for GIMPS/P95 are at 6520000 & 7760000.
    Joe O

  29. #29
    would be nice to see a reply from Louie or someone with access to the code that they are aware of the problem and are looking at it

    Slatz

  30. #30
    I think I may know what the problem is: the program doesn't create the z****** file!

    Or at least this is what occurred on my computer. I finished a test with n=4.97 million or so and then dl'ed a new one with n=5 million or so. Using Go Back 3 Deluxe, I was able to confirm that no z***** file was created for my new n! However, the program refuses to give up trying: even though I successfully expired the test (I hadn't read this thread yet and assumed it was just that test at first) via the "preferences" tab on the main page, the client will not get a new test (and remember, there is no z***** file!). As far as I know, if you expire a test and make sure the z***** file does not exist, you should automatically acquire a new test!

    Therefore, I'm willing to guess that the limitation is the 1.5 million digits thing, and that this limitation is causing a problem with creating a z****** file for n>4.98 million or so.

    Good luck figuring out how to fix this....I'm clueless. I'll just try changing my username to [removed by alien88] for now.

    Oh yeah, I'm running a P4 2.8C with Kingston HyperX 3200, WinXP Pro, a 10,000 RPM HD, and the service install of the client in -o2 mode.
    Last edited by Alien88; 11-02-2003 at 03:51 AM.

  31. #31
    Member
    Join Date
    Jan 2003
    Location
    Germany
    Posts
    36
    Did anyone try to use an older version??

    I once tested the 31337 account with a P4 and it worked just fine. That was when v1.1 wasn't out yet.

  32. #32
    Senior Member eatmadustch's Avatar
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    154
    my test actually switched off my computer!!
    here's the log:
    Code:
    [Sat Nov 01 04:48:03 2003] n.high = 4963519  .  1 blocks left in test 
    [Sat Nov 01 04:55:41 2003] residue: 7F2449D014FB3C22
    [Sat Nov 01 04:55:41 2003] completed proth test(k=33661, n=4972896): result 3  <--damn, no prime ;)
    [Sat Nov 01 04:55:41 2003] connecting to server
    [Sat Nov 01 04:55:42 2003] logging into server
    [Sat Nov 01 04:55:43 2003] requesting a block
    [Sat Nov 01 04:55:48 2003] got proth test from server (k=21181, n=5007260)
    
    *** Sat Nov 01 10:00 wake up, realize computer isn't working, switch it on***
    
    [Sat Nov 01 11:18:22 2003] got k and n from cache
    
    *** start client again, crashes again ...
    
    [Sat Nov 01 11:24:34 2003] got k and n from cache
    I would really like a fix for this, now I only have my slow athlon!
    EatMaDust


    Stop Microsoft turning into Big Brother!
    http://www.againsttcpa.com

  33. #33
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    would be nice to see a reply from Louie or someone with access to the code that they are aware of the problem and are looking at it
    I have heard from Louie. He is on the case.

  34. #34
    Hater of webboards
    Join Date
    Feb 2003
    Location
    København, Denmark
    Posts
    205
    Originally posted by allio
    Have we determined if the crashing is isolated to NT/2k/XP yet? It seems that it's definitely only SSE2 enabled cpus (ie, the p4 and opteron) that are having trouble.
    The "new" client (v1.10) hasn't been ported to anything but windows yet, and I don't think anyone ever made v1.02 run under Linux on a P4 (it was discussed in several threads in February/March). I once managed to make an even older client (I think it was v1.00) run under Linux on a P4, but I have deleted that, and am now using that machine for P-1 factoring.

    But the problem seems very similar to the problem with SBfactor that only occurs under windows.

  35. #35
    Guys, please try to see if you have .z****** files for these exponents. I had THREE different tests greater than 4.98 million on my computer and NONE of them successfully created a .z***** file. If this is also occurring on anyone else's computer, I'm pretty sure that it is what is causing the problem (or, at least, the problem is occurring before the cache file is created, and as far as I know, the process goes like this: 1) You get a test from the server, 2) You create a cache (.z******) file for the test, 3) You actually start testing. Obviously the problem is not in 1), so 2) is the next logical conclusion).

    In case you don't know, the cache file for your tests is a .z******* file that is saved in the directory where you installed SB (or, equivalently, sobsvc). The ******* is the exponent (n-value)....for example, if you were testing 5353*2^5009190+1, your .z******* file would be called .z5009190. Therefore, if you were assigned a test with n>4.98 million, PLEASE check whether or not you have a .z498**** file (or a .z499**** file or a .z500**** file) on your computer and then post about it here!
    Last edited by [EGBT]ComOy; 11-01-2003 at 03:36 PM.
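
The naming rule described in the post above (".z" followed by the exponent, stored in the install directory) can be turned into a quick check. This is a Python sketch only: the helper names are made up for illustration and the directory in the example is a placeholder to replace with your own SB or sobsvc install path.

```python
from pathlib import Path

def cache_file_name(n: int) -> str:
    """Cache file name for exponent n, following the rule in the post above."""
    return f".z{n}"

def has_cache_file(install_dir: str, n: int) -> bool:
    """True if the cache file for exponent n exists in install_dir."""
    return (Path(install_dir) / cache_file_name(n)).is_file()

if __name__ == "__main__":
    # Placeholder directory: replace "." with the client's install directory.
    print(cache_file_name(5009190), has_cache_file(".", 5009190))
```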

  36. #36
    Senior Member eatmadustch's Avatar
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    154
    My client didn't create the z****** files either
    EatMaDust


    Stop Microsoft turning into Big Brother!
    http://www.againsttcpa.com

  37. #37
    Hater of webboards
    Join Date
    Feb 2003
    Location
    København, Denmark
    Posts
    205
    The z******* files aren't made until the test has run for around 10 minutes.

  38. #38
    Wow, I never noticed that before. Well, that means instead of the cache files being the problem, it probably has to do with the communication between the cache and the client (since the tests are obviously cached as they won't go away). But that pretty much leads us right back to square one......


  39. #39
    I have been aware of this problem for a couple days.

    The issue only affects processors that use SSE2 (ie P4, Opterons(?)) for exponents n > 4980000.

    A fix is ready. Download this new version. It will be posted on regular mirrors soon but here it is now:
    SB v1.1.1
    http://www.seventeenorbust.com/download/

    Cheers,
    Louie
    Last edited by Alien88; 11-02-2003 at 02:15 AM.

  40. #40
    woooo im an idiot and confused two usernames.. maybe its time for sleep :P
