Thread: Sob.dat file with 10 k etc. (following 28433=Prime)

  1. #1
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479

    Sob.dat file with 10 k etc. (following 28433=Prime)

    The sob.dat files have been updated to remove k=28433, leaving only 10 k.

    Only the daily updated files are now available.

    The sob.dat for sieving file zipped (0.5MB).
    A clear text version of the same (1MB), just for info.
    The very small zip file that has clear text .dat files that indicate the next couple of days of main and DC PRP efforts, and a log file that indicates the daily progress of the shrinkage of the sob.dat file, again just for info.

    A sob.dat for P-1 factoring ONLY with the next 0.5M main candidates (becomes out of date if not updated every ~30 days).

    All time scoring and 2005 scoring have been adjusted to reflect the newly found prime. Any factors for k=28433 submitted before today will now have their score frozen; any submitted from now onwards will score zero. I expect that Louie or Dave will stop accepting factors for this k on the submission form soon.

    I've tested the new sob.dat file for sieving on a large sample of one PC, and I see a 4% speed improvement. I hope others see better improvements.

    Over the next couple of days I'll be changing other aspects of the scoring pages, and I'll also be re-organising the sieving web site a little, sorry in advance if some pages become inaccessible (the .dat files above won't be moving).

  2. #2
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Mike,

    Is it just me, or has the link to this dat not been updated? I still downloaded the 11k dat, same size etc.

    Hmm, created my own... looks like this wasn't a very demanding k. Looks like about a 5% speed increase. Still very good, and I can't say I'm disappointed in any respect.

    Correct me if I'm wrong, but the new 10k dat should be around 2.77 MB (2,908,006 bytes). I'll download the 10k dat again and see what happens.



  3. #3
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    http://www.aooq73.dsl.pipex.com/sobd...at_n1M-20M.zip gave me a dat file which at least says "10" in the beginning...

  4. #4
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Humm,

    It worked this time; possibly an internet caching issue? I deleted my temp files etc., and it looks like it is (or was) correct...

  5. #5
    Forgotten Member
    Join Date
    Dec 2003
    Location
    US
    Posts
    64
    I'm seeing a slight speed increase. I am sieving close to 2^50 currently.

    Old dat
    pmin=1120143710441429 @ 586 kp/s
    pmin=1120143720441463 @ 577 kp/s
    pmin=1120143730441481 @ 578 kp/s
    New dat
    pmin=1120143740441483 @ 614 kp/s
    pmin=1120143750441491 @ 607 kp/s
    pmin=1120143760441523 @ 612 kp/s
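
    As a quick check on the figures above, averaging the three readings for each dat gives the relative speedup. This is only a minimal sketch: the variable names are illustrative, and the six kp/s readings are the only values taken from the log lines.

```python
from statistics import mean

# kp/s readings copied from the pmin log lines above
old_dat = [586, 577, 578]
new_dat = [614, 607, 612]

old_avg = mean(old_dat)
new_avg = mean(new_dat)

# Relative throughput gain of the new dat over the old one, in percent
speedup_pct = (new_avg / old_avg - 1) * 100

print(f"old {old_avg:.1f} kp/s, new {new_avg:.1f} kp/s, speedup {speedup_pct:.1f}%")
```

    With these six readings the gain comes out a little above 5%. Three samples per dat is a small basis, so treat it as a rough indication.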

  6. #6
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    This is in line with the 4.5% speed increase I'm seeing across my machines.

    Unfortunate in a way,

    11/10 = 1.1 = 110% so a 10% speed increase would be average.
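
    To spell out that arithmetic: if every k cost the same to sieve, dropping one of 11 would cut the work per p by 1/11 (about 9.09%) and raise throughput by 11/10 - 1 = 10%. A small sketch under that equal-cost assumption (the function name is made up for illustration, and the measured figures suggest the assumption does not hold for this k):

```python
def expected_gain(k_before: int, k_after: int) -> tuple[float, float]:
    """Return (throughput gain %, time reduction %) assuming equal cost per k."""
    throughput_gain = (k_before / k_after - 1) * 100
    time_reduction = (1 - k_after / k_before) * 100
    return throughput_gain, time_reduction

# Going from 11 k to 10 k
gain, reduction = expected_gain(11, 10)
print(f"throughput +{gain:.2f}%, time -{reduction:.2f}%")
```

    The two percentages describe the same change from different sides, which is why both 10% and 9.09% get quoted.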

    I think if we eliminate 67607 we get something like a 14% speed increase.

    Does anyone know how the k's are tied together? I thought 67607 was somehow different. I guess 28433 was tied to another k (a shared computational variable with another k); was it 55459?

  7. #7
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    First batch of changes to the score pages are complete. Key change being that I've moved away from arbitrary n and p boundaries for breaking down results, and have instead moved to categorising by number of PRP tests saved.

    I'll update main page to try to give an explanation of what is being shown, but I think you’ll figure most of it out.

    Changes apply to all time and 2005.

    ...one thing we can see from the new data - number of main PRP tests saved from factors found this year = 1. Number of DC PRP tests saved from factors found this year = 0.

  8. #8
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Wonderful!! And very useful. But maybe not very easy to understand for all users.

    Mike, could you please explain how the logic works when you find some time? Thx.

    For example, should we read the

    Total: 120775(202987) 6219( 18512) 54091

    figures like...

    Of all of the factors found so far;

    120775 came before first-pass PRP tests, and PRP level is now above their n level.

    202987 came before first-pass PRP tests, and PRP has not yet reached that level. So, it's not guaranteed that they will save tests (i.e. if a prime is found)

    6219 came after first-pass PRP tests, but before second-pass PRP tests, and second-pass is now above their n level.

    18512 came after first-pass PRP tests, but before second-pass PRP tests, and second-pass not yet reached that level. So, it's not guaranteed that they will save tests (i.e. if a prime is found by second-pass PRP)

    So far, my reasoning looks OK to me. But hey, what about 54091? AFAIK, these figures are for unique factors only. So why would 3513 n=19m-20m candidates not save tests? The figures seem reasonable up to n=5m, but what's going on after that?

    There's definitely something (if not everything) that I'm misinterpreting.
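
    The reading above can be written down as a small parser. This is only a sketch: the field names are hypothetical labels for the five columns as interpreted in this post, not names used on the stats page.

```python
import re

TOTAL_LINE = "Total: 120775(202987) 6219( 18512) 54091"

# Hypothetical field names, following the interpretation in the post above.
FIELDS = (
    "before_first_pass_passed",    # found before first pass; first pass now above n
    "before_first_pass_pending",   # found before first pass; first pass not yet there
    "before_second_pass_passed",   # found between passes; second pass now above n
    "before_second_pass_pending",  # found between passes; second pass not yet there
    "zero_saved",                  # factor saved no test
)

def parse_total(line: str) -> dict[str, int]:
    """Pull the five counts out of a 'Total:' row, in column order."""
    numbers = [int(x) for x in re.findall(r"\d+", line)]
    if len(numbers) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} numbers, got {len(numbers)}")
    return dict(zip(FIELDS, numbers))

totals = parse_total(TOTAL_LINE)
for name, value in totals.items():
    print(f"{name:28} {value:>7}")
print("sum of the five columns:", sum(totals.values()))
```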

  9. #9
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    120775 came before first-pass PRP tests, and PRP level is now above their n level.
    Correct.

    202987 came before first-pass PRP tests, and PRP has not yet reached that level. So, it's not guaranteed that they will save tests (i.e. if a prime is found)
    Correct.

    6219 came after first-pass PRP tests, but before second-pass PRP tests, and second-pass is now above their n level.
    Correct.

    18512 came after first-pass PRP tests, but before second-pass PRP tests, and second-pass not yet reached that level. So, it's not guaranteed that they will save tests (i.e. if a prime is found by second-pass PRP)
    Correct. See, I said you'd figure most of it out.

    So far, my reasoning looks like ok to me.. But, hey what about 54091? AFAIK, these figures are for unique factors only. So, why would 3513 n=19m-20m candidate not save tests? The figures seem reasonable up to n=5m, but hey, what's going on after that???
    These are a combination of factors where two tests had already been performed when the factor was found (this will be the makeup of the smaller n bands) and factors that have become useless because a prime has been found. For k=5359, that's any first timer's n>5272167 and DC's n>400000. For k=28433, that's n>7882623 and n>1255752 respectively. The higher n bands '0 test saved' are made-up exclusively of these "factors made useless by prime" type.

    This is intended to show one of the negative aspects of sieving - some factors will be made redundant when primes are found. But the really positive thing that can be seen from this table is that >16689 tests will be saved in 8M<n<9M. This is more than the 16644 for 3M<n<4M, even though two primes have been found in the meantime.

    The most important thing on that page is the "candidates remaining" (unchanged), but you need to compare that with an old version of the page to see that it's going down, which isn't good.

    I'm now thinking that maybe I should add an "estimated number of PRP tests performed" for each n band. That would then make it really clear that the number of PRP tests per n slice keeps coming down as time goes by, and that's something that wasn't obvious before, and still isn't obvious now. More work in progress, methinks...

  10. #10
    Originally posted by MikeH
    ...one thing we can see from the new data - number of main PRP tests saved from factors found this year = 1. Number of DC PRP tests saved from factors found this year = 0.
    Looks like I am the one that is credited with saving the first test of 2005.
    http://www.aooq73.dsl.pipex.com/2005/scores.htm
    (if I understand the stats correctly)

    Still, to prove Nuri's point that a better explanation of the stats for us sieving n00bs is needed: I don't really understand what the difference is between the "main PRP" and "DC PRP" tests you mention here. How come we have saved a main test without also saving a double check?
    Also, the factor that supposedly saved one test (k=33661, n=7596144) in the main stats is also listed as a "possible ongoing test". Have we saved anything at all here, or did it come too late?
    It is also listed as 2 tests saved on my personal stats page (it shows a 2 without parentheses):
    http://www.aooq73.dsl.pipex.com/2005/ui/1792.htm
    Is there something I am missing here?
    Maybe a detailed idiot-proof explanation of the figures is necessary; not all of us are math geniuses here! (I will be the first to admit I'm stupid or something...)

    Finally, a slightly off-topic stats-related question. My personal stats page shows only the range 490100-491000 as reserved by me and incomplete. In fact, I completed this almost a month ago, did 545-546 afterwards, and now I'm working on 599-600. Why are these shown as non-reserved ranges?

    Mike, I do not intend to nag at all, your work is great! Just wanted to show that some aspects of the sieving are still not for the average user and will require quite some clarification if we are to get more users involved in it...
    Thanks for your great effort anyway

  11. #11
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    It looks really good now, Mike, and I can see where you're going with it.

    The standard number (+number) are tests total (that day) I would assume.

    I'm still not understanding the n>8m as tests saved, zero?

    I'd personally say this just makes things confusing and get rid of it, since these numbers will go to zero once everyone uses the 10k dat.

    I guess it's the finding of a prime that makes these pages confusing.

    Rather than this I would use...

    A separate entry, similar to the 90% sieve point, that somehow represents the % of sieve factors wasted. Something like: 100% x (factors found above the k/n prime) / (total factors found).

    This would only have to be updated at every prime and would explain a lot more. Also remember we only received a 5% speed increase from removing this k from the sieve, not 9.09%.

    I like what you have done thus far a lot.
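
    The wasted-factor percentage suggested above could be computed along these lines. A sketch only: the function name is illustrative, and the inputs in the example call are placeholders, since the pages do not list the "above the prime" factors separately.

```python
def wasted_factor_pct(factors_above_prime: int, total_factors: int) -> float:
    """100% x factors found above a k/n prime / total factors found."""
    if total_factors <= 0:
        raise ValueError("total_factors must be positive")
    return 100.0 * factors_above_prime / total_factors

# Placeholder inputs only, not project data.
print(wasted_factor_pct(1, 4))
```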

  12. #12
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Thanks for the comments mad dog,

    I'll clear up a few things.

    Currently, if you find a factor above PRP (good to see you picked up some terms), you do in fact eliminate two tests. If it's below PRP, then only one.

    PRP is the testing 90+% of the users do with the 2.3 client. Main PRP, or first-time PRP, is just that: never tested before. DC PRP is the double-check PRP, where we are testing the same k/n pairs again to look for errors.

    Main PRP is at ~n=7,900,000, whereas DC PRP is somewhere around n=1,300,000.

    Once the k/n is double checked and the results match, it will never be tested again. If they don't match, then a third and possibly a fourth test are done.

    Also, you're picking up on things and starting to ask the big questions...
    "possible ongoing test" (k=33661, n=7596144)

    This basically means it's within a region where someone may still be testing that number in vain, because you only recently found a factor. We have no way of telling this person to stop the test. The only thing we can do is post it there. Had we/you found that factor before PRP reached n=7596144, the server would never have assigned it.

    Mike will probably update everyone's stats for ranges once all the havoc is over and he's finished the updates. Ranges are the only thing that is still manual.

  13. #13
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Originally posted by MikeH
    These are a combination of factors where two tests had already been performed when the factor was found (this will be the makeup of the smaller n bands) and factors that have become useless because a prime has been found. For k=5359, that's any first timer's n>5272167 and DC's n>400000. For k=28433, that's n>7882623 and n>1255752 respectively. The higher n bands '0 test saved' are made-up exclusively of these "factors made useless by prime" type.

    This is intended to show one of the negative aspects of sieving - some factors will be made redundant when primes are found. But the really positive thing that can be seen from this table is that >16689 tests will be saved in 8M<n<9M. This is more than the 16644 for 3M<n<4M, even though two primes have been found in the meantime.

    The most important thing on that page is the "candidates remaining" (unchanged), but you need to compare that with an old version of the page to see that it's going down, which isn't good.

    I'm now thinking that maybe I should add an "estimated number of PRP tests performed" for each n band. That would then make it really clear that the number of PRP tests per n slice keeps coming down as time goes by, and that's something that wasn't obvious before, and still isn't obvious now. More work in progress, methinks...
    How about using two separate columns for 0 (i.e. 0i for "factors where two tests had already been performed when the factor was found" and 0ii for "factors that have become useless because a prime has been found"), with a short explanation of 0i and 0ii as a footnote to the table, of course?

    I know that, since we've started using the n-min-adjusted sob.dat, the 0i figures will not change much. Only three sources of increase come to mind: i) users that seldom update their sob.dats, ii) users that purposely search for and submit factors for those k/n pairs, and iii) users that dump their out-of-range factors.

    Still, on the broader perspective, it would be interesting to observe one going down, and one going up as n million bands increase, especially if we find a few more primes on the way.

    Originally posted by MikeH
    I'm now thinking that maybe I should add an "estimated number of PRP tests performed" for each n band. That would then make it really clear that the number of PRP tests per n slice keeps coming down as time goes by, and that's something that wasn't obvious before, and still isn't obvious now.
    Why not use actual figures instead of estimated ones? I'm sure Louie will be able to grab them from the database, and I guess he'll be willing to provide this data to you in the form you would use, from a link, every 6 hours. Of course, the resulting figures will have some side effects (like those that come from dropped tests, or from the fact that we dumped residues of previous researchers into the database once upon a time, etc.) that might mislead the end stats fanatic and will include parameters that are beyond the scope of the sieve, but it might still be interesting.
    Last edited by Nuri; 01-04-2005 at 06:45 PM.

  14. #14
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    One other suggestion to make understanding of the table easier for the average user is to add three subtotal lines just above Total row.

    These lines should indicate;

    - subtotal of factors below DC active n level (n<1259597 currently),
    - subtotal of factors in between (1259597<n<7899646 currently), and
    - subtotal of factors above main active n level (7899646<n<20000000).

    This will also make it possible to follow how many first-time PRP tests are left to reach 20m. Or even better, one will easily assess the expected proportion of his new factors (i.e. what percent of new factors will save two tests, and what percent will save only one), especially if he updates his sob.dat on a regular basis. Currently, the 532336 figure shown in the table for candidates remaining does not make much sense, as it's almost guaranteed that some of the candidates (i.e. those below the DC n threshold) will remain there forever.
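
    The three subtotals could be produced along these lines. A sketch only: the thresholds are the current levels quoted above, while the function name, band labels and sample n values are invented for illustration.

```python
from collections import Counter

DC_LEVEL = 1_259_597      # DC active n level quoted above
MAIN_LEVEL = 7_899_646    # main active n level quoted above
N_MAX = 20_000_000        # upper end of the sieve range

def band(n: int) -> str:
    """Assign an exponent n to one of the three proposed subtotal bands."""
    if n < DC_LEVEL:
        return "below DC"
    if n < MAIN_LEVEL:
        return "between DC and main"
    if n < N_MAX:
        return "above main"
    raise ValueError(f"n={n} is outside the sieve range")

# Sample exponents for illustration only, not real factors.
sample_n = [500_000, 3_000_000, 7_000_000, 8_500_000, 19_500_000]
subtotals = Counter(band(n) for n in sample_n)
print(dict(subtotals))
```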

    Just a thought.

  15. #15
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Originally posted by vjs
    I'm still not understanding the n>8m as tests saved, zero?
    They will be saved once the PRP first-pass n level passes them. Until then, they are only candidates for "two PRP tests saved", shown as (2).

    When the PRP first-pass n level passes them, they will move to "two PRP tests saved".

    Or am I misunderstanding the question?

  16. #16
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Nuri,

    This question was based on something Mike already changed, or something I saw differently earlier, but thanks. I think Mike's pages look great and add a lot... of course they are complicated for the first-timer, even upon initial examination, but what are we expecting from the page? I think the level of understanding required for this page is high, but that's what this page is for...

    Other pages, like the scores page, are what newer people will focus on at first. Then they will try to understand the details; there is something for everyone now.

  17. #17
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Hey Mike, your all-users page doesn't work; did you take it down?

    The http://www.aooq73.dsl.pipex.com/ui/9999.htm

  18. #18
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    Hey Mike, your all-users page doesn't work; did you take it down?
    It's now http://www.aooq73.dsl.pipex.com/ui/19999.htm. If you use the link which is the 'total' at the bottom of the user score page, that should always work.

  19. #19
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Thanks Mike, this is one of my favorite pages; it shows the project as a whole, who is doing what, etc.


  20. #20
    Senior Member Frodo42's Avatar
    Join Date
    Nov 2002
    Location
    Jutland, Denmark
    Posts
    299
    Originally posted by vjs
    Thanks Mike, this is one of my favorite pages; it shows the project as a whole, who is doing what, etc.
    Wow ... I never saw that page ... that's surely also going to be one of my favorites.
