Thread: Windows V2.2 Bug!!!

  1. #1
    Former QueueMaster Ken_g6[TA]'s Avatar
    Join Date
    Nov 2002
    Location
    Colorado
    Posts
    184

    Exclamation Windows V2.2 Bug!!!

    I believe I've found a bug in V2.2 that can cause it to miss primes!

    As I mentioned in another thread, I've been using the SB client to search for some primes on my own. Today I decided to test it to make sure it could find some known primes. It was working fine until I stopped it (by clicking Exit, not Stop) and then continued the test a few minutes later. Here's the log file from my Athlon XP@1800MHz:
    [Tue Dec 21 13:16:48 2004] connecting to server
    [Tue Dec 21 13:16:48 2004] logging into server
    [Tue Dec 21 13:16:48 2004] requesting a block
    [Tue Dec 21 13:16:48 2004] got proth test from server (k=123, n=102929)
    [Tue Dec 21 13:16:48 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Tue Dec 21 13:19:44 2004] got k and n from cache
    [Tue Dec 21 13:19:44 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Tue Dec 21 13:19:44 2004] restarting proth test from cache (k=123, n=102929) [33.4%]
    [Tue Dec 21 13:20:05 2004] residue: D4C63EB08A36BB33
    I double-checked with PRP and Proth, and 123*2^102929+1 is a prime number. So I re-tested on a Celeron 667MHz. When I let the test run straight through, it works fine. But when I interrupted it, it didn't! Here's the logfile:
    [Tue Dec 21 14:00:33 2004] got k and n from cache
    [Tue Dec 21 14:00:33 2004] Intel(R) Celeron(R) processor detected. Enabling cpu specific optimizations.
    [Tue Dec 21 14:02:42 2004] residue: 0000000000000000
    [Tue Dec 21 14:02:42 2004] completed proth test(k=123, n=102929): result 2
    [Tue Dec 21 14:03:59 2004] got k and n from cache
    [Tue Dec 21 14:03:59 2004] Intel(R) Celeron(R) processor detected. Enabling cpu specific optimizations.
    [Tue Dec 21 14:05:16 2004] got k and n from cache
    [Tue Dec 21 14:05:16 2004] Intel(R) Celeron(R) processor detected. Enabling cpu specific optimizations.
    [Tue Dec 21 14:05:16 2004] restarting proth test from cache (k=123, n=102929) [33.8%]
    [Tue Dec 21 14:06:32 2004] residue: 774B0C7AEDD87B9E
    [Tue Dec 21 14:06:32 2004] completed proth test(k=123, n=102929): result 3
    So does this mean all restarted tests we've done with V2.2 are wrong? :shocked: I'm afraid to even try earlier versions.
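
For readers unfamiliar with the test being run in these logs, here is a minimal sketch of a Proth primality check in Python. It is an illustration under assumptions, not the SB client's code: the function name `proth_test` and the choice of bases are hypothetical, and the client's 64-bit residue calculation is not reproduced here.

```python
def proth_test(k, n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23)):
    """Return True if N = k*2^n + 1 is proven prime by Proth's theorem.

    Proth's theorem: for odd k < 2^n, N is prime if some base a satisfies
    a^((N-1)/2) == -1 (mod N). A False result only means that none of the
    tried bases proved primality.
    """
    if k % 2 == 0 or k >= 2 ** n:
        raise ValueError("need odd k with k < 2^n")
    N = k * 2 ** n + 1
    for a in bases:
        # Modular exponentiation; N - 1 stands for -1 modulo N.
        if pow(a, (N - 1) // 2, N) == N - 1:
            return True
    return False


print(proth_test(3, 2))  # N = 13
print(proth_test(3, 3))  # N = 25
```

A composite N can never return True here, because the theorem guarantees that no base reaches -1. A prime N could in principle return False if every tried base happens to be a quadratic residue, which is why several bases are tried. Running this on k=123, n=102929 in pure Python should work but is likely to be slow.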
    Proud member of the friendliest team around, Team Anandtech!
    The Queue is dead! (Or not needed.) Long Live George Woltman!

  2. #2
    Moderator Joe O's Avatar
    Join Date
    Jul 2002
    Location
    West Milford, NJ
    Posts
    643
    Try running this same KN pair on the same machine but stopping it after a different amount of time has elapsed and restarting it. Do the residues match?
    Joe O

  3. #3
    omg - if this is true (for all client versions), it would be a disaster

  4. #4
    Former QueueMaster Ken_g6[TA]'s Avatar
    Join Date
    Nov 2002
    Location
    Colorado
    Posts
    184
    Good ideas for tests.

    The restart point does appear to matter, though I didn't try restarting at exactly the same point. That would be tough.
    [Tue Dec 21 21:27:05 2004] restarting proth test from cache (k=123, n=102929) [50.4%]
    [Tue Dec 21 21:28:02 2004] residue: 48FBE3DF8FB4D0F3
    [Tue Dec 21 21:28:02 2004] completed proth test(k=123, n=102929): result 3
    Now for the good news: version 1.2.5 doesn't seem affected! The following logs are from a reinstall of 1.2.5:
    [Tue Dec 21 21:28:07 2004] Intel(R) Celeron(R) processor detected. Enabling cpu specific optimizations.
    [Tue Dec 21 21:31:37 2004] got k and n from cache
    [Tue Dec 21 21:31:37 2004] Intel(R) Celeron(R) processor detected. Enabling cpu specific optimizations.
    [Tue Dec 21 21:35:31 2004] residue: 0000000000000000
    [Tue Dec 21 21:35:31 2004] completed proth test(k=123, n=102929): result 2
    [Tue Dec 21 21:36:48 2004] got k and n from cache
    [Tue Dec 21 21:36:48 2004] Intel(R) Celeron(R) processor detected. Enabling cpu specific optimizations.
    [Tue Dec 21 21:40:54 2004] got k and n from cache
    [Tue Dec 21 21:40:54 2004] Intel(R) Celeron(R) processor detected. Enabling cpu specific optimizations.
    [Tue Dec 21 21:40:54 2004] restarting proth test from cache (k=123, n=102929) [50.0%]
    [Tue Dec 21 21:42:54 2004] residue: 0000000000000000
    [Tue Dec 21 21:42:54 2004] completed proth test(k=123, n=102929): result 2
    I can't test version 2, as I have neither a P4 nor a copy of it.

  5. #5
    Originally posted by Ken_g6[TA]

    I can't test version 2, as I have neither a P4 nor a copy of it.
    This is with version 2.0
    Code:
    [Wed Dec 22 07:29:41 2004] got k and n from cache
    [Wed Dec 22 07:29:41 2004] Intel(R) Pentium(R) 4 CPU 2.80GHz detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 07:29:51 2004] got k and n from cache
    [Wed Dec 22 07:29:51 2004] Intel(R) Pentium(R) 4 CPU 2.80GHz detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 07:29:51 2004] restarting proth test from cache (k=123, n=102929) [25.7%]
    [Wed Dec 22 07:30:04 2004] residue: 0000000000000000
    [Wed Dec 22 07:30:04 2004] completed proth test(k=123, n=102929): result 2
    Here is the same with v2.2:
    Code:
    [Wed Dec 22 07:33:49 2004] Intel(R) Pentium(R) 4 CPU 2.80GHz detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 07:34:01 2004] got k and n from cache
    [Wed Dec 22 07:34:01 2004] Intel(R) Pentium(R) 4 CPU 2.80GHz detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 07:34:01 2004] restarting proth test from cache (k=123, n=102929) [32.3%]
    [Wed Dec 22 07:34:13 2004] residue: 0000000000000000
    [Wed Dec 22 07:34:13 2004] completed proth test(k=123, n=102929): result 2
    And here with v2.2 and Serv.handl. 1.6:
    Code:
    [Wed Dec 22 07:36:13 2004] got k and n from cache
    [Wed Dec 22 07:36:13 2004] Intel(R) Pentium(R) 4 CPU 2.80GHz detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 07:36:23 2004] got k and n from cache
    [Wed Dec 22 07:36:23 2004] Intel(R) Pentium(R) 4 CPU 2.80GHz detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 07:36:23 2004] restarting proth test from cache (k=123, n=102929) [48.7%]
    [Wed Dec 22 07:36:32 2004] residue: 0000000000000000
    [Wed Dec 22 07:36:32 2004] completed proth test(k=123, n=102929): result 2
    No problem with the client for P4.


    Code:
    [Wed Dec 22 08:09:05 2004] Intel(R) Pentium(R) III processor detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 08:09:26 2004] got k and n from cache
    [Wed Dec 22 08:09:26 2004] Intel(R) Pentium(R) III processor detected.  Enabling cpu specific optimizations.
    [Wed Dec 22 08:09:26 2004] restarting proth test from cache (k=123, n=102929) [22.2%]
    [Wed Dec 22 08:10:27 2004] residue: 206A9736E900993B
    [Wed Dec 22 08:10:27 2004] completed proth test(k=123, n=102929): result 3
    Optimization for non-SSE2 procs seems to be broken!!!
    Last edited by Joh14vers6; 12-22-2004 at 02:15 AM.

  6. #6
    Senior Member
    Join Date
    Apr 2004
    Location
    Florianopolis - Santa Catarina - Brazil
    Posts
    114
    It looks like a serious problem. I hope it gets fixed soon.

  7. #7
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    May I suggest that everyone run with either "garbage" or "supersecret" as the user name for a few days, until this issue gets sorted out?

    Basically, these accounts have their k/n pair residues checked sooner, so we don't miss a prime. This is a terrible time to have errors: we are statistically very close to a new prime.

  8. #8
    Originally posted by vjs
    May I suggest that everyone run with either "garbage" or "supersecret" as the user name for a few days, until this issue gets sorted out?

    Basically, these accounts have their k/n pair residues checked sooner, so we don't miss a prime. This is a terrible time to have errors: we are statistically very close to a new prime.
    I have tested the P4, as you can see, and that is all right. I have no AMD64 or Opteron to test with. I suggest stopping the client on all non-SSE2 processors, reinstalling v1.2.5, and putting this key in the registry:
    Code:
    [HKEY_LOCAL_MACHINE\SOFTWARE\LhDn\sob\cache]
    "cache"=dword:00000000
    Then start the client again; it will fetch another test. Delete the old test cache file zxxxxxxx and the test on the pending tests page http://www.seventeenorbust.com/accou...sPending.mhtml.
    Last edited by Joh14vers6; 12-23-2004 at 10:27 AM.

  9. #9
    Senior Member Frodo42's Avatar
    Join Date
    Nov 2002
    Location
    Jutland, Denmark
    Posts
    299
    How about testing on some of the bigger numbers where SOB has found primes ...

    This is really bad if this means we can miss primes ... has anyone been in contact with Louie, David or anyone else with access to the source code about this problem?

  10. #10
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    I have a feeling that the client is just fine; it was tested fully in beta and was a long time coming.

    I have a feeling that those who are having problems have a problem with either their machine or the install.

    If you're really having problems...

    1. Wait until the old test finishes with the old client, or simply delete the zfile.

    2. Uninstall the service and remove the program fully from the system.

    3. Delete the folder.

    4. Restart.

    5. Reinstall the new client with the exact same directory structure as the old client.

    If it still has problems, it's your computer!!! The new client is X% faster; doesn't that mean it works different parts of the system, or the same parts X% faster? RAM that was on the verge of being flaky is now causing problems, etc.

    The major reason I'm such an advocate of supersecret is that it should show problems almost immediately, on an individual basis, based upon IP.

    I wish the project moderators would force, or at least strongly suggest, supersecret for 24 hours, or possibly add 50 supersecret tests to the main queue on a daily basis. Posting a daily pull showing mismatched residues vs. IP address should also be pretty simple to do. But let's face it, this is some work, and they are busy doing the great job that they currently are.

    I'm going to start a main thread on this last paragraph.
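
The daily pull of mismatched residues vs. IP address suggested above could look something like the following Python sketch. Everything about the input is an assumption: the project's database layout is not described in this thread, so the `(ip, k, n, residue)` tuple format, the function name `mismatches_by_ip`, and the example addresses are hypothetical. Only the residues are taken from logs posted in this thread.

```python
from collections import Counter, defaultdict


def mismatches_by_ip(reports):
    """Count, per IP, the k/n pairs on which reported residues disagree.

    `reports` is an iterable of (ip, k, n, residue) tuples (hypothetical
    layout). Every IP that reported a disputed pair is counted, since a
    mismatch alone does not say which report is the wrong one.
    """
    by_pair = defaultdict(list)
    for ip, k, n, residue in reports:
        by_pair[(k, n)].append((ip, residue.upper()))

    counts = Counter()
    for entries in by_pair.values():
        if len({res for _, res in entries}) > 1:  # residues disagree
            for ip in {ip for ip, _ in entries}:
                counts[ip] += 1
    return dict(counts)


reports = [
    ("192.0.2.1", 123, 102929, "0000000000000000"),
    ("192.0.2.2", 123, 102929, "D4C63EB08A36BB33"),
    ("192.0.2.1", 46157, 698207, "0000000000000000"),
    ("192.0.2.3", 46157, 698207, "0000000000000000"),
    ("192.0.2.2", 334093, 365250, "BD7585FCC4292CC1"),
    ("192.0.2.3", 334093, 365250, "AF9ECF23BDC26973"),
]
print(mismatches_by_ip(reports))
```

An address that shows up in many disputed pairs is the one worth a closer look; an address with a single dispute may simply have been paired with a bad result from someone else.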

  11. #11
    Former QueueMaster Ken_g6[TA]'s Avatar
    Join Date
    Nov 2002
    Location
    Colorado
    Posts
    184
    None of my CPUs are overclocked, and I highly doubt they're at fault. This is a perfectly reproducible bug. Here are the results when I tried the same procedure on a larger prime, the first one Seventeen or Bust found.

    First a straight-through test:
    [Wed Dec 22 09:35:41 2004] got k and n from cache
    [Wed Dec 22 09:35:41 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Wed Dec 22 10:27:19 2004] residue: 0000000000000000
    [Wed Dec 22 10:27:19 2004] completed proth test(k=46157, n=698207): result 2
    Then a test I stopped for a while:
    [Wed Dec 22 11:01:19 2004] got k and n from cache
    [Wed Dec 22 11:01:19 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Wed Dec 22 16:14:46 2004] got k and n from cache
    [Wed Dec 22 16:14:46 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Wed Dec 22 16:14:47 2004] restarting proth test from cache (k=46157, n=698207) [3.8%]
    [Wed Dec 22 16:50:44 2004] resolving hostname
    [Wed Dec 22 16:50:44 2004] opening connection
    [Wed Dec 22 16:50:48 2004] logging into server
    [Wed Dec 22 16:50:48 2004] login successful
    [Wed Dec 22 16:50:48 2004] n.high = 512827 . 1 blocks left in test
    [Wed Dec 22 17:04:52 2004] residue: 012AE741F6893700
    [Wed Dec 22 17:04:52 2004] completed proth test(k=46157, n=698207): result 3
    Here's something striking. I tried running the same smaller prime on a Linux machine (PII-400). It followed the same pattern, even though the test restarted from 0.0%!

    Once again, the straight-through:
    [Wed Dec 22 13:12:59 2004] client process [v2.2] invoked
    [Wed Dec 22 13:12:59 2004] priority set to idle
    [Wed Dec 22 13:12:59 2004] connecting to server
    [Wed Dec 22 13:12:59 2004] logging into server
    [Wed Dec 22 13:12:59 2004] requesting a block
    [Wed Dec 22 13:12:59 2004] got proth test from server (k=123, n=102929)
    [Wed Dec 22 13:12:59 2004] server packet cached to disk
    [Wed Dec 22 13:12:59 2004] Intel Pentium II or Pentium II Xeon processor detected. Enabling cpu specific optimizations.
    [Wed Dec 22 13:13:18 2004] iteration: 10000/102936 (9.71%) k = 123 n = 102929
    [Wed Dec 22 13:13:37 2004] iteration: 20000/102936 (19.43%) k = 123 n = 102929
    [Wed Dec 22 13:13:56 2004] iteration: 30000/102936 (29.14%) k = 123 n = 102929
    [Wed Dec 22 13:14:15 2004] iteration: 40000/102936 (38.86%) k = 123 n = 102929
    [Wed Dec 22 13:14:33 2004] iteration: 50000/102936 (48.57%) k = 123 n = 102929
    [Wed Dec 22 13:14:52 2004] iteration: 60000/102936 (58.29%) k = 123 n = 102929
    [Wed Dec 22 13:15:11 2004] iteration: 70000/102936 (68.00%) k = 123 n = 102929
    [Wed Dec 22 13:15:30 2004] iteration: 80000/102936 (77.72%) k = 123 n = 102929
    [Wed Dec 22 13:15:48 2004] iteration: 90000/102936 (87.43%) k = 123 n = 102929
    [Wed Dec 22 13:16:07 2004] iteration: 100000/102936 (97.15%) k = 123 n = 102929
    [Wed Dec 22 13:16:13 2004] residue: 0000000000000000
    [Wed Dec 22 13:16:13 2004] completed proth test(k=123, n=102929): result 2
    And the paused:
    [Wed Dec 22 13:18:35 2004] client process [v2.2] invoked
    [Wed Dec 22 13:18:35 2004] priority set to idle
    [Wed Dec 22 13:18:35 2004] got k and n from cache
    [Wed Dec 22 13:18:35 2004] Intel Pentium II or Pentium II Xeon processor detected. Enabling cpu specific optimizations.
    [Wed Dec 22 13:18:54 2004] iteration: 10000/102936 (9.71%) k = 123 n = 102929
    [Wed Dec 22 13:19:13 2004] iteration: 20000/102936 (19.43%) k = 123 n = 102929
    [Wed Dec 22 13:19:32 2004] iteration: 30000/102936 (29.14%) k = 123 n = 102929
    [Wed Dec 22 13:19:51 2004] iteration: 40000/102936 (38.86%) k = 123 n = 102929
    [Wed Dec 22 13:20:09 2004] iteration: 50000/102936 (48.57%) k = 123 n = 102929
    [Wed Dec 22 13:20:28 2004] iteration: 60000/102936 (58.29%) k = 123 n = 102929
    [Wed Dec 22 13:20:47 2004] iteration: 70000/102936 (68.00%) k = 123 n = 102929
    [Wed Dec 22 13:21:06 2004] iteration: 80000/102936 (77.72%) k = 123 n = 102929
    [Wed Dec 22 13:21:24 2004] iteration: 90000/102936 (87.43%) k = 123 n = 102929
    [Wed Dec 22 13:21:36 2004] client process [v2.2] invoked
    [Wed Dec 22 13:21:36 2004] priority set to idle
    [Wed Dec 22 13:21:36 2004] got k and n from cache
    [Wed Dec 22 13:21:36 2004] Intel Pentium II or Pentium II Xeon processor detected. Enabling cpu specific optimizations.
    [Wed Dec 22 13:21:36 2004] restarting proth test from cache (k=123, n=102929) [0.0%]
    [Wed Dec 22 13:21:55 2004] iteration: 10000/102936 (9.71%) k = 123 n = 102929
    [Wed Dec 22 13:22:14 2004] iteration: 20000/102936 (19.43%) k = 123 n = 102929
    [Wed Dec 22 13:22:33 2004] iteration: 30000/102936 (29.14%) k = 123 n = 102929
    [Wed Dec 22 13:22:51 2004] iteration: 40000/102936 (38.86%) k = 123 n = 102929
    [Wed Dec 22 13:23:10 2004] iteration: 50000/102936 (48.57%) k = 123 n = 102929
    [Wed Dec 22 13:23:29 2004] iteration: 60000/102936 (58.29%) k = 123 n = 102929
    [Wed Dec 22 13:23:48 2004] iteration: 70000/102936 (68.00%) k = 123 n = 102929
    [Wed Dec 22 13:24:06 2004] iteration: 80000/102936 (77.72%) k = 123 n = 102929
    [Wed Dec 22 13:24:25 2004] iteration: 90000/102936 (87.43%) k = 123 n = 102929
    [Wed Dec 22 13:24:44 2004] iteration: 100000/102936 (97.15%) k = 123 n = 102929
    [Wed Dec 22 13:24:50 2004] residue: 5AF2408C8CAD1848
    [Wed Dec 22 13:24:50 2004] completed proth test(k=123, n=102929): result 3
    And just to show you my CPU's not flaky, I'm running 100 small primes straight through (not stopping, so I can use V2.2 and be sure it will be correct). I'll attach a link to the logfile when I'm done.

    As promised, the logfile. It only did 99, because I had a minor communications glitch in the middle, but all 99 were re-proven pseudoprime.

    Now I'll have to try and figure out a way to automatically stop the program every minute or so, to check the bug on all 100.
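
One way to stop the program automatically every minute or so is a small wrapper script. The sketch below is in Python and rests on assumptions: the function name `run_with_interruptions` is made up, the client's command line is not given in this thread, and `terminate()` is not the same as clicking Exit (on Windows it is a hard kill), so whether the client saves its progress when stopped this way would need to be checked first.

```python
import subprocess


def run_with_interruptions(cmd, interval, max_stops):
    """Run cmd, stopping and relaunching it every `interval` seconds.

    After `max_stops` interruptions the command is launched one last time
    and left to finish. Returns how many times it was interrupted.
    """
    stops = 0
    while stops < max_stops:
        proc = subprocess.Popen(cmd)
        try:
            proc.wait(timeout=interval)
            return stops  # the process finished by itself
        except subprocess.TimeoutExpired:
            proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
            proc.wait()
            stops += 1
    subprocess.run(cmd)  # final run, not interrupted
    return stops
```

It would be called with the client's own command line, for example `run_with_interruptions([path_to_client], 60, 5)`, where `path_to_client` is a placeholder for wherever the client is installed.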
    Last edited by Ken_g6[TA]; 12-22-2004 at 11:23 PM.

  12. #12
    Thanks for finding this. Louie has been informed and is working on a fix. At that point, what we will probably do is put all the v2.2 tests back into the queue to be retested, and then those that don't have a matching residue will be retested again.

    This is something that wasn't tested, and I honestly didn't even think about testing it, so I doubt Louie did. But it'll all be sorted out soon.

    --
    Mike

  13. #13
    Jedi Knight pumpkin0's Avatar
    Join Date
    Jan 2004
    Location
    Auckland, New Zealand
    Posts
    24
    Wow! Excellent detective work, Ken_g6[TA].

    I'm a little confused - am I correct in stating that only non-P4 CPUs with the 2.2 client are producing these erroneous results?

  14. #14
    Senior Member Frodo42's Avatar
    Join Date
    Nov 2002
    Location
    Jutland, Denmark
    Posts
    299
    Maybe the V2.2 announcement that says "everybody should upgrade" should be removed for now, until this is fixed.

  15. #15
    Moderator Joe O's Avatar
    Join Date
    Jul 2002
    Location
    West Milford, NJ
    Posts
    643
    Could someone who has reproduced the bug try this?
    Take a known composite and run it two ways: once from start to finish without interruption, and a second time pausing/stopping in the middle and then resuming.
    Are the residue results the same or different?
    Joe O

  16. #16
    Former QueueMaster Ken_g6[TA]'s Avatar
    Join Date
    Nov 2002
    Location
    Colorado
    Posts
    184
    As a matter of fact, Joe, when I discovered this bug I went back through the logs to find which of the numbers in the (small) range I was checking had been paused. There weren't many, so I re-did them before continuing the search last night.

    The only one whose original test I could find in my logfiles (one machine is off) was:

    Previously paused:
    [Thu Dec 16 11:52:07 2004] got proth test from server (k=334093, n=365250)
    [Thu Dec 16 11:52:07 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Thu Dec 16 11:52:55 2004] got k and n from cache
    [Thu Dec 16 11:52:55 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Thu Dec 16 11:52:55 2004] restarting proth test from cache (k=334093, n=365250) [13.8%]
    [Thu Dec 16 12:04:27 2004] residue: BD7585FCC4292CC1
    [Thu Dec 16 12:04:27 2004] completed proth test(k=334093, n=365250): result 3
    Straight through:
    [Wed Dec 22 23:18:14 2004] got proth test from server (k=334093, n=365250)
    [Wed Dec 22 23:18:14 2004] AMD Athlon(tm) XP 2200+ detected. Enabling cpu specific optimizations.
    [Wed Dec 22 23:31:12 2004] residue: AF9ECF23BDC26973
    [Wed Dec 22 23:31:12 2004] completed proth test(k=334093, n=365250): result 3
    Also in my Java program's logfile (which looks like PRP's because I designed it that way):
    Previously paused:
    [Mon Dec 20 14:09:56 MST 2004]
    358509*2^365250+1 is not prime. Res64: BA28869C45A7D71E

    Straight through:
    [Wed Dec 22 22:00:12 MST 2004]
    358509*2^365250+1 is not prime. Res64: 91F7B0BD22543C87

    Note: None of these residues match the residues from PRP 2.3. They may match the new residue calculation in PRP 3 if anyone with a P4 wants to try it.

    P.S. Thanks, Mike, for contacting Louie about this.
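
Going back through a client logfile to find which tests had been paused, as described at the top of this post, could be sketched in Python as follows. The line patterns are taken from the logs posted in this thread; the function name `summarize_tests` is made up, and the sketch assumes a paused test always logs a "restarting proth test from cache" line before it completes.

```python
import re

RESTART = re.compile(r"restarting proth test from cache \(k=(\d+), n=(\d+)\)")
RESIDUE = re.compile(r"residue: ([0-9A-Fa-f]{16})")
DONE = re.compile(r"completed proth test\(k=(\d+), n=(\d+)\): result (\d+)")


def summarize_tests(log_lines):
    """Return (k, n, residue, result, restarted) for each completed test."""
    restarted = set()    # k/n pairs resumed from cache since their last completion
    last_residue = None  # most recent residue line seen
    summary = []
    for line in log_lines:
        m = RESTART.search(line)
        if m:
            restarted.add((int(m.group(1)), int(m.group(2))))
            continue
        m = RESIDUE.search(line)
        if m:
            last_residue = m.group(1)
            continue
        m = DONE.search(line)
        if m:
            key = (int(m.group(1)), int(m.group(2)))
            summary.append((*key, last_residue, int(m.group(3)), key in restarted))
            restarted.discard(key)
            last_residue = None
    return summary


log = [
    "[Thu Dec 16 11:52:55 2004] restarting proth test from cache (k=334093, n=365250) [13.8%]",
    "[Thu Dec 16 12:04:27 2004] residue: BD7585FCC4292CC1",
    "[Thu Dec 16 12:04:27 2004] completed proth test(k=334093, n=365250): result 3",
    "[Wed Dec 22 23:18:14 2004] got proth test from server (k=334093, n=365250)",
    "[Wed Dec 22 23:31:12 2004] residue: AF9ECF23BDC26973",
    "[Wed Dec 22 23:31:12 2004] completed proth test(k=334093, n=365250): result 3",
]
for row in summarize_tests(log):
    print(row)
```

Rows with `restarted` set to True are the ones worth re-running straight through.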

  17. #17
    Originally posted by Ken_g6[TA]

    Also in my Java program's logfile (which looks like PRP's because I designed it that way):
    Previously paused:
    [Mon Dec 20 14:09:56 MST 2004]
    358509*2^365250+1 is not prime. Res64: BA28869C45A7D71E

    Straight through:
    [Wed Dec 22 22:00:12 MST 2004]
    358509*2^365250+1 is not prime. Res64: 91F7B0BD22543C87

    Note: None of these residues match the residues from PRP 2.3. They may match the new residue calculation in PRP 3 if anyone with a P4 wants to try it.

    P.S. Thanks, Mike, for contacting Louie about this.
    As you can see in my earlier post, I cannot reproduce the bug with a P4; I can only reproduce it with a P3. I have 6 P3s at the office, but the office is closed until 3 Jan 2005. I am downgrading all clients on the P3s to v1.2.5 and letting them start a new test.

  18. #18
    This has been fixed. Please see the new sticky post.
