High n-range sieving



vjs
11-24-2004, 10:38 AM
This is the start of the high range sieving thread.

The proposal is to start sieving a higher n range.

Facts:

1. There are more factors at lower p
2. A 1m<n<100m dat file is quite large and may not even work with cmov
3. Starting a new dat file from 1T wouldn't find additional factors below n=20m until we get to ~500T
4. Some people believe sieving above 20m right now is pointless.
5. Our factor density is decreasing, leading to difficulties in determining whether a range has been sieved or not.
6. This project will continue past 20m if it wishes to finish
7. Sieving a 20m<n<80m range runs at about 60% of the speed of sieving 1m<n<20m
8. This speed is probably representative of a 1m<n<60m file
9. Eliminating k's with primes will speed up sieving
10. A new client that hasn't made it to beta yet may increase sieve speeds by 5x.
11. Factor files don't take up a lot of space; if they do, as a community we can find space for them.


It is my belief we should soon start sieving a larger range always including those n's above double check.

Example: stop the [dc (double check) < n < 20m] sieve at some T, say 600T, once we find a prime, or once we find a new client.

Continue from that point forward with a new *.dat (dc < n < some high n); I suggest 80m, others believe 100m.

I don't think it's right to sieve 1-20, then dc-40, then dc-60, then dc-80, etc... It doesn't make sense. Widen the sieve range to the max we are going to do asap (dc<n<100m), then continue sieving to very deep p.

A lot of factors will be removed from n=20m+ at low p; this can be a second effort for those who wish to do it now, or we could start the low p range once we get to n=18m.

Thoughts, comments...

We could also use the low p-range to test the new client.

Thanks to all

minbari
11-24-2004, 05:30 PM
It is my belief that the sieving effort is already fragmented to the point of being almost ineffective. The sole purpose of sieving is to eliminate candidates from the prp tests. All the low n ranges being sieved in the dc ranges should be reserved for the slowest machines, as this area is not (or should not be) a high priority. Everything else available should be sieving, followed closely by the P-1 effort, just ahead of the main prp testing.

Our focus should be on eliminating as many candidates as possible, as fast as possible, just ahead of the main prp test range.

Moving off into the distant future of prp testing (sieving in the >20m range) is IMO a waste of resources. It will be years before the main prp testing could hope to be in that area. By then there may well be processors that could prp test those numbers faster than we can now sieve them.

Then again, what do I know.



;)

vjs
11-24-2004, 06:44 PM
minbari,

Yes, you are correct that sieving n's out past 20m is sort of useless at this point with cmov. Cmov is still good at factor production and test elimination on a per-machine basis, and sieve is still much better at eliminating k/n pairs than prp'ing with P4's and the new prp client.

However, there is a possibility of a new sieve client in the works that could speed the sieve up greatly. Currently there isn't any reason to believe that it even works, but in the mac sieve client discussion there is a link. That link directs you to the riesel forum where someone is creating a new client; whether it works with sob.dat is still questionable....

If it works as reported, the initial speed increase was from something like 200 kps to 1800 kps. If this is the case, we could start sieving a 1m<n<100m dat file at a greater speed than current, since that client is not as strongly dependent on n-range.

However, if we do decide to start sieving a larger range, why start again from 0 after we decide to stop with 1m<n<20m? Why not start asap from where we are with 1m<n<100m, if there is no appreciable speed decrease.

Well, one major problem is the dat file would be very large, >30mb. However, sieving out the first 150G (000000-000150) or so will reduce the dat file to less than half its size. Something more manageable for users.

And how do we know if this client works with large n or not? We need a list of factors from the current client up to 100m over a decent range. Then check if the factors are correct and if either client misses anything, by sieving the same range with both clients...

Who knows if this will happen... but we'll have those factors when someone needs them.

But the point is let's not start yet but at some point, and let's be smart about our decision of when and how deep. Also, let others know what we are doing and post our findings.


<joke>
If we had an insanely fast sieve client, 1000's of Yp/s, we could probably complete all k/n pairs up to 100m within a few months; we would just have to sieve all p values to p=2^50m. :p

minbari
11-24-2004, 07:06 PM
Yes, I am aware of the "discussion/development" of the new sieve client. I truly would like to believe it will be as fast as described, however I am reserving judgement until I see it in action and the results can be verified. I have run several sieving clients (proth_sieve, NewPGen, ksieve, etc.) on a lot of different projects, some public, some not, and have never seen anything close to the speeds reported; if true it will be one heck of an improvement!

On a different note, will the current client even be able to test that high? I know at one time the GIMPS (Prime95) client was limited to something like 74-79M exponents.

Keroberts1
11-24-2004, 08:40 PM
I believe that the huge speed improvements are stated for 64 bit processors only, although some improvements will be seen all around. However, in a few years time 64 bits will be the standard. Also, with a significantly deeper sieve depth we will also be eliminating a good portion of the tests that would have to be PRP'd well into the future. There is no reason not to save the effort now; if we wait and do the extra work later we'll be having to redo all of our sieve efforts. Why not just do them both at once and go twice as deep into the P ranges? In due time we will reach depths far beyond 20,000,000 with the PRP, and even if 6 primes are found before 100,000,000 then 70% of the factors we find will still save at least one test.

ceselb
11-24-2004, 10:02 PM
Not only 64 bits - the numbers stated are for 32 bits. (Benchmarked with a very rough alpha test suite; 8x faster on my P4-1.5.)

Keroberts1
11-25-2004, 03:19 AM
geesh gimme gimme gimme

Nuri
11-25-2004, 07:09 AM
Looks promising....

I'm excited to see the new client at work and try it with a couple of range alternatives.

I guess it would be wise to decide on what to do with some data on hand, i.e. after seeing performance results at various n range alternatives.

If that new client fails to come soon enough, I think it would be wise to wait until our next prime for a major change. Yes, we have estimates for when the next prime will come, but who knows really when...

In case, for example, we have no new sieve client and no prime for the next two years, and sieve builds up to the point where it becomes pointless - or has very little value added - to go further (i.e. p=2^50 ??), we might even consider shifting relatively more powerful machines to P-1 for some time until the next prime finally pops up, n at PRP increases significantly, or a faster sieve client comes in handy. This is, of course, one of the worst case scenarios (I hope).

Joe O
11-25-2004, 08:28 AM
As Ceselb can tell you, maintaining the coordination thread is a lot of work. The fewer passes we do, the fewer threads, the less work for the coordinators. Though I think that we should wait before starting a high-n sieve effort, I have uploaded SoB.dat_20M_100M_S25G.zip to the kerobertsdatfiles group under Yahoo. Thanks to Nuri for allowing this.

I have the factors from 25G to 26G for 20M_100M, but these are not reflected in this file. VJS has the factors from 26G to 175G for 20M_80M and is working on the factors for 80M_100M. When he sends them to me, I will create a new sob.dat file with his factors removed as well.

Happy Thanksgiving!

ceselb
11-25-2004, 09:21 AM
Originally posted by Joe O
As Ceselb can tell you, maintaining the coordination thread is a lot of work.
This is very true. I'm reluctant to get involved in more coordination unless it's really needed.
It's good that you're thinking about it, but until we reach 17-18M I don't think there's any rush. Tests at that size will take a really long time to finish.

Chuck is also going to implement an automated sieve, so we should wait for that imho.

vjs
11-25-2004, 09:33 AM
Joe,

I sent you a mail regarding the factors, let me know if you can pick them up that way.

Were you going to send me the 80m<n<100m dat so that I can close the gap from 26-175G quicker?

I can also send you the out of range ones, but I'd prefer to simply sieve that region with the 80 to 100m dat.

Nuri
11-25-2004, 12:43 PM
BTW, if we get a sieve client that's 8x faster, it will also mean the current sieve effort will be 8x more productive, i.e. any machine will get 8x more factors per day, which would make sieving the current dat deeper more meaningful.

Nuri
11-25-2004, 12:46 PM
It would be cool to see cumulative sieve factors eliminated per day catch, or even pass, that of PRP tests completed per day again.

vjs
11-26-2004, 10:08 AM
Let's see, we average around 40 most days, so 8X would be over 300 tests a day. That would be really easy to pass prp.

Also, if we were to get a sieve client 8X faster, I think you would see a lot of people in prp switch to sieve. I know a lot of people on my team are actually running some fast athlon processors in prp, :-( . Fast enough that they can still complete WU's in a reasonable fashion; imagine if these people switched to sieve.

I would think a 100m dat up to several 1000T within a year would be more than possible.

vjs
12-01-2004, 12:51 PM
How about an update on that missed factor at 95G...

Death
12-02-2004, 03:19 AM
vjs, do you want to have 2 parallel sieving projects or do you want to completely switch?

how long would it take the admins to patch the script to accept 20+M n's?
and the database will require some changes...

royanee
12-02-2004, 04:21 AM
Nothing would need to be changed really, just configured. We've had multiple sieving efforts before (300k - 3M), (3M - 20M), (1M - 20M). We aren't anywhere close to needing to sieve deep in 20M+, but if people want to get a head start, that's fine. Just make sure to coordinate however you have to, but keep it small (read: high quality) until it is added to the main sieving pages (sob.com/sieve, Mike's stats). :)

vjs
12-02-2004, 12:16 PM
I have been working on this for some time, giving it thought, etc. I'm not convinced it's time to switch to a higher range, but I wanted to gather the data so there is less guesswork involved, and in gathering the data perhaps make it easier for a higher range switch.

But first let me point out that under no circumstances should we stop sieving from 1.15m (doublecheck) up to at least 20m. We are still getting a lot of factors and it will be beneficial to continue the effort as it stands for quite some time.

The only question that remains is whether we should consider sieving beyond some point (a T level, another prime, or two primes) with a larger range. At that point, be it 600T or 2000T, all ranges reserved above that point would be sieved with a 2m<n<40m dat file, for example.

I'll post findings later today.

vjs
12-02-2004, 03:10 PM
High n-sieving and extending sieve ranges: a discussion to ponder and understand...
For the past week or so Joe_O and myself have been looking at the effects of high n-sieving, evaluating clients and the efficiency of doing so. This project as a whole is multifaceted, so what's best for sieve might not be best for the project.

It's always a trade-off between how fast you want results now vs the efficiency of doing things slowly and correctly the first time. Finding the exact perfect spot for your tent and setting it up pointing true north on flat ground while you set up the campsite might take the better part of a day. But if you want to throw it up any old way it can be done quickly, in less than 5 min. Which one is better??? Depends if it's raining I guess :)

Definitions:

Sieve Range – This is the difference between the highest and lowest n, for all k/n’s tested in the Sob.dat file. 1m<n<20m [1-20] is a 19m range containing all k/n pairs that require testing with the main client.

The Sob.dat file contains all of the k/n pairs in a particular sieve range which we have not found factors for yet. It’s larger with a larger range and will decrease in size when k/n pairs are eliminated through factoring.

Sieve speed – The speed of the client reported in kp/s, ie 500 kp/s, means that 500,000 p’s are tested against the entire dat file in one second.

p's – These are prime numbers which are checked against the k/n pairs to see if they are factors for those k/n pairs. If one is a factor, that k/n cannot be prime; a prime itself has no factors (besides 1 and itself) but can be a factor of a larger number.

1/Sieve speed – Inverse of the sieve speed, or the number of seconds required to sieve a number of p across a dat file. Example: I will be using 1Gp/s, or 1,000,000 kp/s. If your speed is 200 kp/s, or 0.0002 Gp/s, it would take you 1/0.0002 = 5000 sec to sieve 1G over the dat file.

1/Sieve speed/Range – The above divided by the size of the range in m. If you were able to sieve at 200kp/s over a 20m range vs a 40m range it would take you the same time to complete the range but you would do twice as much work.

Example

50 sec / Gp / Mrange = It takes 50 seconds to sieve 1G over a 1m range such as 2m<n<3m.

The lower this number, the more efficient the client. If this value stays the same, then whether you do 1G over a 1M range or 0.1G over a 10M range, the amount of sieving work done is the same, but the time it takes to cover a range of G will be different.
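To make these definitions concrete, here is a minimal sketch (Python, purely illustrative; the 200 kp/s and 20m figures are just the examples used above):

[code]
# Sketch of the metrics defined above: seconds per Gp, and seconds per Gp
# normalized by the dat range size in M (lower = more efficient).

def sec_per_gp(kps):
    # 1 Gp = 1,000,000 kp, so time for 1G of p is 1,000,000 / (kp/s)
    return 1_000_000 / kps

def sec_per_gp_per_mrange(kps, range_m):
    return sec_per_gp(kps) / range_m

print(sec_per_gp(200))                  # 5000 sec to sieve 1G at 200 kp/s
print(sec_per_gp_per_mrange(200, 20))   # 250 sec/Gp per M of range for a 20m dat
[/code]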

O.K. Here is what we have found.
All of the data collected by myself was completed on a 2500 MHz Barton Athlon chip, on an nForce2 single-channel board with 256mb of memory, running Windows 2000 and proth 0.42. Most data from 1M up to 25G was done by Joe_O.

Joe_O did most if not all of the dat files, removing k/n pairs with factors etc.

We both have all fact.txt, factrange.txt, for the range of 20m<n<100m sieved from 1m or 0.001G to 175G.

I have all of the factexcl.txt on my machine over 300mb.

The 20m<n<100m dat file has been reduced in size from over 30mb to less than 17mb by removing all factors found with sieving up to 175G.

Now for the data and what happened.

The original dat files used were taken from Keroberts' yahoo group and created by Nuri. That of 20m<n<80m was sieved by myself from 1G to 175G.
Joe used the 20m<n<100m dat file and sieved it from 0.001G to 25G.
Joe later created an 80m<n<100m dat, which I sieved from 25G to 175G.

Joe used various programs he can comment on; I used proth 0.42, the most recent one as of Dec 1, 2004, exclusively.

Large portions of the ranges I sieved produced out of range factors, in factrange.txt. Coincidentally, while sieving the 20m<n<80m dat file, factrange.txt had already caught 10% of the factors which were later found in the fact.txt for the 80m<n<100m dat. Fact.txt for 80-100 didn't miss any of the factors the 20-80 factrange.txt found.

Note: In sieving the 20-80 dat, a factor for a k/n pair where n=11m was found in factrange.txt. This factor should have been eliminated by the sieve effort already using the 1-20 dat file; the user for some reason missed it. Sieving that particular range found the factor with the current proth and dat.

Joe removed a tremendous number of factors from 1m-25G, and more were removed by myself from 25G to 170G. Ranges were double-checked for missed factors by dividing up the data into different ranges, re-sieving portions with updated dats, and checking for new factors; no missed factors were found through this method.

This means proth finds factors equally well (or badly) over a range no matter how it is divided up.

Early sieving using the 20m<n<80m dat file was slow with proth, ~100 kp/s; however, speeds increased drastically by 90G (340 kps) and continued to increase to around 370 kp/s by 175G. Benchmarks were done at 170-171G with the updated dat files.

Proth sieve will not sieve a range much larger than ~70M, i.e. 1m<n<70m. Removing k's from the dat file doesn't extend this limit; it seems to be more dependent on range size than on the range's position or the number of k.

Data

Range (M)    R.Size (M)  kp/s  sec/Gp  sec/Gp per range(M)  sec/Gp per 40M  sec/Gp per 20M
1.1m-20m     18.9        622   1608    85.1                 3403            1701
80m-100m     20          609   1642    82.1                 3284            1642
20m-50m      30          567   1764    58.8                 2352            1176
20m-60m      40          537   1862    46.6                 1862            931
60m-100m     40          523   1912    47.8                 1912            956
20m-80m      60          368   2717    45.3                 1812            906
20m-100m*    80          199   5025    62.8                 2513            1256

All range speeds are at 170G; error is less than 2%.
*- Range completed with sobsieve

From the above table one can see that as range size increases, the sieve speed decreases.

Sieving the same range size at high n, i.e. 1.1-20 (yes, a 19m range, I know) vs 80-100, or 20-60 vs 60-100, has little effect on sieve speed: <5% difference.

sec /Gp per range(M) – is a measure of efficiency; although the speed may decrease for a larger range, it is more efficient since more work is done.

Sobsieve is less efficient than proth 0.42 even though it can sieve larger ranges.

Sieve efficiency increases drastically with n-range, tailing off by a 40m range, i.e. (20-60) or potentially a (1m<n<40m).

The efficiency of a 60m dat isn't much better than a 40m dat.

Second table combining ranges and efficiency

More data combining ranges.

If one were to sieve from 20m<n<100m, it could be done in several ways with the above figures/values, using sobsieve, or splitting the data file into parts: 20m<n<60m and 60m<n<100m, or 20m<n<80m and 80m<n<100m, or 20m<n<40m, 40m<n<60m, 60m<n<80m and 80m<n<100m.


Range (M)     R.Size (M)  kp/s  sec/Gp  sec/Gp per range(M)  sec/Gp per 40M  sec/Gp per 20M
20m-60m       40          537   1862    46.6                 1862            931
60m-100m      40          523   1912    47.8                 1912            956

Combined
20m-60m-100m  80          N/A   3774    47.2                 1887            944


Range (M)     R.Size (M)  kp/s  sec/Gp  sec/Gp per range(M)  sec/Gp per 40M  sec/Gp per 20M
20m-80m       60          368   2717    45.3                 1812            906
80m-100m      20          609   1642    82.1                 3284            1642

Combined
20m-80m-100m  80          N/A   4359    54.5                 2180            1090

As you can see from the above, from a sieve perspective dividing into two equal ranges of 40m is most efficient, since it uses two 40m dat files; the use of a 20m and a 60m isn't that much worse, it only takes about 15% longer.

However dividing into 4 equal 20m ranges produces



Range (M)         R.Size (M)  kp/s  sec/Gp  sec/Gp per range(M)  sec/Gp per 40M  sec/Gp per 20M
1.1m-20m          18.9        622   1608    85.1                 3403            1701
20m-40m           20          est   1608    85.1                 3403            1701
60m-80m           20          est   1642    82.1                 3284            1642
80m-100m          20          609   1642    82.1                 3284            1642

Combined
1.1-20-40-60-100  80          N/A   6500    81.3                 3254            1627

This would take about 1.7 times as long, a time increase of roughly 72%, over two 40m ranges.

What this means is that if we needed to sieve the entire range from 20m<n<100m for all k in the shortest amount of time up to some specified T, two 40m ranges would work best. If we needed to sieve from 1m<n<100m, two equally divided ~50m ranges would be most efficient.
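To make the comparison concrete, here is a minimal sketch that reproduces the "combined" figures above from the measured sec/Gp values (only numbers from the tables are used; the 20m-40m and 60m-80m entries are the estimates given there):

[code]
# Total time to cover 1G of p when a big range is split into separate dat
# files, each sieved on its own. sec/Gp values are the 170G measurements above.
sec_per_gp = {
    "1.1m-20m": 1608, "20m-40m": 1608,   # 20m-40m estimated equal to 1.1m-20m
    "20m-60m": 1862, "60m-100m": 1912,
    "20m-80m": 2717, "80m-100m": 1642,
    "60m-80m": 1642,                     # estimated equal to 80m-100m
}

def combined(ranges, total_size_m):
    total = sum(sec_per_gp[r] for r in ranges)
    return total, total / total_size_m   # sec/Gp, and sec/Gp per M of range

print(combined(["20m-60m", "60m-100m"], 80))                          # (3774, ~47.2)
print(combined(["20m-80m", "80m-100m"], 80))                          # (4359, ~54.5)
print(combined(["1.1m-20m", "20m-40m", "60m-80m", "80m-100m"], 80))   # (6500, ~81.3)
[/code]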

The best scenario would be a client able to sieve the entire range in a more efficient manner. If proth were capable of using one 100m range that might be best, but the increased efficiency, sec /Gp per range(M), may only work out to around ??43??.

Before anyone draws conclusions on sieve ranges etc, one must also realize that as k’s are eliminated the sieve speed will increase, decreasing the efficiency of sieving higher n that will never be needed.

However, it was predicted that only 3 more primes will be found by 20m, increasing the current sieve speed by ~3/11ths from current:

7th 6.5M-10.1M
8th 10.1M-16.8M
9th 16.8M-29.5M

and only two between 20m-100m

10th 29.5M-57M
11th 57M-127M

If we keep on eliminating k's with primes, sieve speed will continue to increase, but it won't double by 100m.

Questions, comments, further explanations, corrections, etc. (other than grammatical of course :-) it's late...

Mystwalker
12-02-2004, 05:19 PM
Originally posted by vjs
Sieve speed – The speed of the client reported in kp/s, ie 500 kp/s, means that 500,000 p’s are tested against the entire dat file in one second.

p's – These are prime numbers which are checked against the k/n pairs to see if they are factors for those k/n pairs. If one is a factor, that k/n cannot be prime; a prime itself has no factors (besides 1 and itself) but can be a factor of a larger number.


Just a little correction:

p's in the speed sense are just plain numbers (which hopefully yield factors). The high performance we experience now is partly due to the fact that non-prime numbers can be left out (omitting all even numbers doubles the speed, and so on).

So although only primes are tested for trial factoring, the speed is metered just in numbers...
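For what it's worth, a minimal sketch of that point (standard wheel arithmetic, nothing specific to proth_sieve): the fraction of p candidates left after skipping multiples of the first few small primes is the product of (1 - 1/q).

[code]
# Fraction of candidate p's remaining after excluding multiples of small primes.
# Skipping evens alone halves the work; each further prime helps a bit less.
from functools import reduce

def surviving_fraction(small_primes):
    return reduce(lambda acc, q: acc * (1 - 1 / q), small_primes, 1.0)

print(surviving_fraction([2]))           # 0.5
print(surviving_fraction([2, 3]))        # ~0.333
print(surviving_fraction([2, 3, 5, 7]))  # ~0.229
[/code]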

vjs
12-02-2004, 05:44 PM
Mystwalker,

It was/is my understanding that the reported speed took this into account, or that the number of tested p in 1G of sieve range doesn't change much from range to range anyway.

Regardless, all data was collected using the same sieve range "T", so the number of p in a range wouldn't change the above data points or analysis, just the number of factors found in a range of p's. I could repeat a small range around 600T if someone is really dismissing the above due to T level. But thanks for pointing out the error.


^
|
|

"I'm never wrong, but there was one time I though I made a mistake, but I was mistaken" :rotfl: :rotfl: :rotfl:

If anyone goes through and understands this analysis, do you agree with the following.

Proth %eff is maximized using ~40m ranges... and does the following make sense for the analysis:

The decrease in time "sec /Gp per range(M)" as you approach a 40m range size starts to be offset by the size of the range "sec/Gp", so maximum efficiency looks to be in the region of a 40m range size???

Note this only concerns sieving I don't want to bring in reducing the number of k yet.

royanee
12-02-2004, 06:20 PM
It would be interesting to see how that other sieve client reacts to the data. Since you have so much information about the factors from .001G to 175G, testing it on the new client I heard mention of might give some evidence towards its ability to find factors correctly. With that client, doing an 80M dat might not be a problem. Still though, unless we see a huge jump in PRP (:Pokes: ), it will be a while before this work will be (1) needed, and (2) effective. With an estimated 2-3 primes before we get to that horizon, that's 2/11 or 3/11 of your work invalidated. I don't think that thinking about 20m to 100m is a bad thing. Instead, it is very good to plan ahead. It's just that right now, if you want to put 27.3% of your cycles into the project and have it possibly be wasted... (Not trying to be mean, but you understand what I'm saying.)

Keroberts1
12-02-2004, 06:24 PM
2 or 3 elevenths would not be wasted; only the factors found for larger N values than where the prime is found. Also, a prime doesn't speed up the sieve by one eleventh; it will only end up being (I believe) around 7%, so in reality the effect of finding primes on the sieve is mainly measured in the number of factors found.

ShoeLace
12-02-2004, 06:28 PM
vjs,

I would agree with your statement that

If anyone goes through and understands this analysis, do you agree with the following.

Proth %eff is maximized using ~40m ranges... and does the following make sense for the analysis.

but more than that, proth efficiency is maximized by using the minimum number of equal-sized dat files over a given range.

so for 20m-100m, a range of 80m:
a single 80m range is most efficient,
BUT since the client does not handle 80m ranges (or doesn't handle them well),
2 * 40m is the next most efficient.

likewise

20-200m -> 180m range
again a single 180m range would be best,
but failing that, and assuming the current clients,
3 * 60M is left as an efficient option.

which i guess raises the question (no begging here), to which i may have just missed the answer.. WHY 100m?

if we can (using prothsieve) use 60m ranges.. why not 20-80m and 80-140m?

Keroberts1
12-02-2004, 07:24 PM
the idea is that the range needs to be expanded; exactly how much is uncertain. However, a range from 1-80m will take about 20% less time to sieve than a 1-160m range, but to sieve both a 1-80 million range and an 80-160m range would take about twice as long as the 1-80m range and 66% longer than the 1-160m range.

ceselb
12-03-2004, 02:23 AM
Some minor errors in reasoning, but in the scope of these calculations it doesn't matter.

I'd like to add one thing however:
Available memory will play in more as the n range grows.
It will take at least 2-3 years to reach 20M (possibly more).
Most new computers will have at least 1Gb of ram at higher bus speeds than now.
Based on that I'd say that splitting isn't a good idea; keeping 20-100M is better imo.

Death
12-03-2004, 03:23 AM
well, as I understand it, the n's in sieving are the same as the n in a regular sob test...

so why do we sieve to 20M and not 10M? the main effort will catch up to n=10M in a year or two. maybe we should try a 5-10M range? vjs, can you do some research on this range?

larsivi
12-03-2004, 03:55 AM
Originally posted by Death
well, as I understand it, the n's in sieving are the same as the n in a regular sob test...

so why do we sieve to 20M and not 10M? the main effort will catch up to n=10M in a year or two. maybe we should try a 5-10M range? vjs, can you do some research on this range?

Because we want to sieve as much as possible before PRP starts at all. The large size of the future tests combined with the fast return of new factors means that we should sieve all ranges where we will do PRP as much as possible before starting the PRP-testing. The low n ranges had a lot more PRP-tests than the ranges we do now, because of the increased number of factors removed through sieving (and factoring).

Mystwalker
12-03-2004, 06:09 AM
Originally posted by Death
so why do we sieve to 20M and not 10M? the main effort will catch up to n=10M in a year or two. maybe we should try a 5-10M range? vjs, can you do some research on this range?

It's more efficient to do e.g. 5M-20M than 5M-10M, 10M-15M and 15M-20M.
5M-20M is not that much slower than 5M-10M - as you can see below, doubling the range slows down sieving speed by < 20%, whereas it would be 50% when you do those ranges one after another.

So in the short run, it would be better to just concentrate on the current PRP range. But as this window moves along, we'll lose effort in the long run...

vjs
12-03-2004, 12:40 PM
A lot of really good questions and responses here and I’ll try to answer a lot of them in a few posts,

Post from Death,

can you tell something about 5-10M range?

Yes, Death I did a few more short tests to find actual numbers for you…

Examples

Range (M)    R.Size (M)  kp/s  sec/Gp  sec/Gp per range(M)  sec/Gp per 40M  sec/Gp per 20M
7.56m-7.58m  0.02        866   1155    57736.7              2309469         1154734
0.3m-3m      2.7         700   1429    529.1                21164           10582
1.1m-20m     18.9        622   1608    85.1                 3403            1701
20m-60m      40          537   1862    46.6                 1862            931

These are just for the dat's we have available, since they do take time to create, etc.

I'm going to use the current dat as a reference. There are two other dat's that we have tested fully: the low n sieve range from 300k-3m, a 2.7M range which sieves at 700 kp/s as compared to ~622 on my machine, and a 7.56-7.58 dat, a 0.02M range which sieves at 866 kp/s.

As you can see they are fast, but it is the speed of the client over a particular dat size that's important. If you start looking at the "sec /Gp per range(M)", the numbers are terrible. Sure, you cruise through a 100G range looking for factors in the active window 7.56m-7.58m in ~70% of the time the current dat takes. But we wouldn't be able to sieve very deep before the factors fall into double check. The effect is less so with a 2.7M dat, saving only 13% of your time.

The point above is that you would not be able to sieve very deep with a small range before the factors are only useful for double checks; this is true for very small dat's.

Let's look at this another way: say we sieve from 1m-40m. Those 20m-40m dat's won't be used for a long, long time, but if we did them now, sure it would decrease our speed, but by how much... 14%. So if we know for certain that we will sieve again from 20m<n<40m with the exact same client, then by sieving the two ranges in conjunction as a 1m<n<40m we save ~86% of our time.
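A rough check of those two figures, using only the measured speeds above (and assuming a stand-alone 20m-40m dat would run at about the same speed as the 1.1m-20m one, as the tables suggest):

[code]
# "Extend to a 40m dat now" vs "sieve 20m-40m separately later"
kps_20m, kps_40m = 622, 537                 # measured: 18.9M dat vs 40M dat
slowdown_now = 1 - kps_40m / kps_20m        # ~0.14 -> the "14%" above

t_20m = 1_000_000 / kps_20m                 # sec/Gp for a ~20M dat
t_40m = 1_000_000 / kps_40m                 # sec/Gp for a 40M dat
# Extra time spent now, compared with the whole separate pass it replaces:
saved = (2 * t_20m - t_40m) / t_20m         # ~0.84, close to the ~86% quoted
print(round(slowdown_now, 2), round(saved, 2))
[/code]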

On an aside, and I believe Keroberts1 commented on this: of the 20m<n<40m effort, some of the factors produced won't be used. Which ones? The ones we find primes for, and only those k/n's above the prime and below 40m. Also, it's predicted that 3/11ths of them will fall by 20m, but that's a guess; might be 4, could be 2, and on the off chance (highly unlikely) none. My estimate was very close to Keroberts1's; I thought 8% ... we would have an effective 8% loss of efficiency due to primes being found with a 1m<n<40m from here on.

ShoeLace,

I see you understand my post fully. Yes, the largest range that proth can sieve is the most efficient, so sieving a 60M range would be the best.

However, the client slowdown, or the time it takes to finish ranges, is unacceptable IMHO for a 60M range. The increased efficiency of the client is outweighed by the time it takes to finish a range and produce factors that we will use within ~2-5 years, not more. After all, the goal of sieve is to eliminate prp testing in the long run, to make that effort more efficient.

If we were to start now for above-20m sieving only, then yes, 60m or the largest possible. But if we were to simply extend the current dat, then I'd say only extend it to around a 40m range, maybe slightly more, perhaps less, but 30m is too small.

Basically what it boils down to is how deep we want to be in sieve T, how far ahead we expect our next couple of primes, and how fast our current client is. Of course you know this and I see what you're getting at… So if we have to sieve in two ranges, we will of course sieve the smaller one first, as far as we need to, then start the second as prp first pass approaches.

And to answer the question why 100m… because that’s what we had available with dat’s. If I had dat’s to 1000m I would have tried to get an end point in the graph but I think everyone can agree that sieving to 1000m would be insane. (Hold on let me check with myself and I, yes all three of us are in agreement :-)

Also, now is definitely not the time for 20-80m, but extending from 1m-60m would be more in line and reasonable. But I do think 60m is too far, and also we should only continue from our current T.

Ceselb,

I’m not sure regarding the errors in reasoning, could you point them out…

As for the client's memory consumption, the current dat file consumes ~25Mb and the 20-80 dat consumes ~43Mb, so it's an additional ~18Mb.

I think most machines running proth right now wouldn't suffer from an additional ~18mb. Also I'd say 80%, if not more, of the "effort" currently dedicated is from machines with at least 256mb already, and that number will increase as you said. I also ran all of the tests on a 256mb system for just that reason.

I agree. I also think 2 years might be a little optimistic; 3 years is more plausible.
So that basically means we have about 1 year, maybe more, before we need to start the 20m+ sieving if we want to get to any appreciable depth.

Mystwalker,

I agree with your last post except for the 5M-20m comment, which I'm sure you didn't intend to suggest. The lowest n we should sieve for should always be the double check prp. Or 20m from 0T to where we leave the 1-20m dat.

Mystwalker
12-03-2004, 12:50 PM
Originally posted by vjs
Mystwalker,

I agree with your last post except for the 5M-20m comment, which I'm sure you didn't intend to suggest. The lowest n we should sieve for should always be the double check prp. Or 20m from 0T to where we leave the 1-20m dat.

Of course, we should fully include PRP double checking. The 5M lower bound was "e.g." and originated from the "5M-10M" in Death's posting.

vjs
12-03-2004, 12:58 PM
Yes Mystwalker, and thanks; it's always good to hear confirmation and additional explanations from others.

Should I post my excel files and graphs for:

sec /Gp per range(M) calculations etc.

vjs
12-03-2004, 03:50 PM
Here is a graph; more data is needed in the 30-50M range, plus a largest-dat-file "end point".

I cut the d off the end of the "speed" label.

Everything is normalized,

Speed - The actual sieve client speed, higher the better of course

Speed per 1M range - The total speed of the client divided by the range size, the higher the better

Optimal Range Size - This takes some explaining; it correlates the actual speed of the client vs the range speed of the client.

The Optimal Range is basically a plot of the effective speed of the client, but the lower the number the better.

Also, everything crosses over at ~19M b/c that's what everything is normalized to.

vjs
12-07-2004, 02:59 PM
O.K. my last post on this topic for sometime unless someone wishes to discuss this issue.

Here is the final graph below. The y-axis represents the % change relative to the current dat size. The x-axis represents the dat size (i.e. our current 1.1M<n<20M dat has an 18.9M size).

As you can see, everything crosses at an 18.9m dat size with 100% relative efficiency; this is simply b/c I normalized everything to the current dat.

A couple interesting conclusions can be drawn:

First, in our testing of proth, the size of the n-values within the dat does not seem to affect client speed.

- a 1M<n<21M, 20M<n<40M, 40M<n<60M, 60M<n<80M, and 80M<n<100M, all run at about the same speed.

- Decreasing the dat file size does decrease the amount of time required to finish a range; however, our current dat is quite small to begin with.

Our dat file would have to shrink from the current ~18.9m size to approximately a 5m dat to see a 10% increase in sieve speed (i.e. an 8m<n<13m dat would only be 10% faster).

- Increasing the dat file size does increase the time required to complete a range.

This is the blue trace on the graph: a 1m<n<41m dat would require roughly 20% more time to complete the same range compared to the current dat.

A 1m<n<50m dat roughly 25% more time.
A 1m<n<55m dat roughly 35% more time.

However, due to some effect proth has with very large ranges, a 1m<n<63m dat would take twice as long to complete, 100% more time.

This can be seen as the sharp increase in sieve time of the blue plot.

As for the pink plot, this represents the time required to sieve a range divided by the size of the dat file.

Sieve time (or client speed) does not increase or decrease dramatically with ranges less than 50m: sieving a 20M range may take 100 minutes, but sieving a 40M range would take 120 minutes. So in effect the amount of time spent sieving the first half of the range was only 60 minutes, and 60 minutes on the second half. Of course you have to do both halves of the dat at the same time, but in essence you sieve what you would have sieved in 100 minutes in 60 minutes; you just have to spend another 60 sieving the other half.

This can be seen in the decrease in the pink line with dat file size; however, some effect takes over at around 55m, increasing the time again.

What this leads to is a maximum efficiency of the client.

The yellow plot represents this efficiency;

at approximately a 53m dat file size the client efficiency peaks, at over 200% more effective than the current dat.

So what this means is that, from a purely sieving stance, we could sieve a dat file from 1m<n<55m (over 2.5x larger) in only about 30% more time and be over 200% more efficient with our sieve effort.
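To put rough numbers on that trade-off, a small sketch using only the time ratios quoted above (read off the blue trace, so these are approximations, and "throughput" here is simply range growth divided by time growth):

[code]
# Relative sieving throughput of a larger dat vs the current 18.9M one.
current_size_m = 18.9
time_ratio = {40: 1.20, 49: 1.25, 54: 1.35, 62: 2.00}   # range size (M) -> time vs now

for size_m, ratio in time_ratio.items():
    throughput = (size_m / current_size_m) / ratio       # work per unit time vs now
    print(f"{size_m}M range: ~{throughput:.1f}x the sieving work per unit time")
# ~1.8x at 40M, ~2.1x around 50-55M, back down to ~1.6x by the low 60s
[/code]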


The only drawback in doing so is that we would decrease our current factor production for the 1m<n<20m range by about 30%. However, we would get all of those factors between 20m<n<55m and wouldn't have to resieve this range.

Of course there are other factors at work, primes, prp time, etc.

What I propose is that at some point, if we are still using proth, we extend the current effort from 1M<n<20M to 1M<n<50M, and only continue from the current T level at that time.

Of course, if we decide to wait for a prime, the 30% decrease in sieve speed would be offset by a ~10% increase from the eliminated k.

It's also important to note that we should find two more primes before we reach 20m but probably not 3.

vjs
12-07-2004, 03:10 PM
The graph for above

vjs
12-15-2004, 04:58 PM
O.K. one more post,

The Y-axis is total factors found removing k/n pairs from testing
vs
sieve depth in T.

basically what this shows is

Sieving from 0T- ~4T removed 250,000 factors
Sieving from 0T- 10T removed 265,000 factors
Sieving from 0T- 99T removed 312,000 factors

So sieving from 10T to 100T removed an additional 17% or 47,000 tests...

The plot basically shows the fall off in useful factors with T.

Regardless, sieving removes more k/n pairs than prp in the same amount of time until at least 5000T; we are currently at about 580T.
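A minimal sketch of that falloff, using just the cumulative counts quoted above:

[code]
# Diminishing returns of deeper sieving (cumulative factors by sieve depth in T).
factors_by_depth = {4: 250_000, 10: 265_000, 99: 312_000}

depths = sorted(factors_by_depth)
for lo, hi in zip(depths, depths[1:]):
    gained = factors_by_depth[hi] - factors_by_depth[lo]
    pct = 100 * gained / factors_by_depth[lo]
    print(f"{lo}T -> {hi}T: +{gained} factors (+{pct:.1f}% over what was already found)")
# 4T -> 10T:  +15000  (+6.0%)
# 10T -> 99T: +47000  (+17.7%)
[/code]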

ceselb
12-15-2004, 07:46 PM
Originally posted by vjs
Regardless, sieving removes more k/n pairs than prp in the same amount of time until at least 5000T; we are currently at about 580T.

If we can go as fast above 1125T (2^50).

vjs
02-13-2005, 01:26 PM
Update,

Just to update everyone: we have almost doublechecked all sieve ranges less than 25T with a 991<n<50M dat. This dat is slower than the 1.5M<n<20M dat, but not by much considering its size, and it covers all of the k/n pairs contained in the 1.5M<n<20M dat as well. There are a few holes left unsieved <25T, but not many, and we will finish these shortly.

Please consider all sieve ranges <25000G doublechecked and no longer in need of retesting.

0-25000 [completed] (a double check sieve for the current dat)



An update on our progress...

In doing this 0-25000G sieve we have been collecting factrange factors for n values over 50M and applying them to a 50M-100M dat. (May as well keep them.) By sieving up to a little less than 25T we have managed to eliminate almost half of the k/n pairs we originally started with; it's quite surprising. We have also found quite a few missed factors in the 1.5M<n<20M range. Is this a good use of our computing power? Questionable, but we feel the effort is valid for the missed factors and our curiosity.

I'd ask that nobody at this time double check any sieve ranges without posting first. We may have already sieved the region you have in question. We would also like to keep track of which ranges have been doublechecked.

Thanks for your time.

vjs
02-18-2005, 02:00 PM
Update on the high-n sieving,

For the past few weeks Joe_O, myself, and a very few others have been doing some high n sieving. We have gone through various dats, etc., as you can see from previous discussions, and are currently sieving a dat with a size of ~50M. (This is near optimal from a pure sieve stance; optimal is ~53M.)

The dat we are using is 991<n<50M, ~2.7 times larger than the 1.5M<n<20M dat currently in use by the main effort. This 2.7X size increase reduces the client speed by roughly 15% when compared to the 18.5M dat in current use.

Since our dat overlaps the current effort we have found several factors missed by the previous and current efforts.

What we have sieved so far is all p=~64k to p=7T with a 991<n<50M dat; several holes exist, but basically all p<25T are either in progress or complete.

In addition to this low-p sieving we have been collecting large factors >20M and small factors <1.5M from the factrange.txt files submitted by those people who chose to send their factrange.txt factors to factrange@yahoo.com. We have taken these factors, as well as factrange from our own efforts, and applied them to Joe's database of all unfactored k/n pairs in the range of 0<n<100M. Currently the smallest unfactored n is n=991.

In order to reduce the total number of k/n pairs in this database we have also sieved 50M<n<100M to approximately 3T, b/c the factor density is so high at low p.

Our results...

We have managed to reduce the 991<n<50M dat size from >20mb to approximately 8.1Mb currently; also the memory requirement for running proth sieve has been reduced from a high of ~50Mb to ~32.5Mb currently, while speeding up the client.

Table of results...

See above post for descriptions of table



(n>)  (n<)  Start    Now      T.Fact   10K      2.5T   3T   3T+   5T+
0     1     28187    27609    578      39       17     139  251   132
1     3     53908    53158    750      23       22     76   378   251
3     8     131984   131369   615      0        0      284  143   188
8     10    53115    52847    268      0        0      124  61    83
10    20    265330   264119   1211     240      335    0    284   352
20    30    648872   300372   348500   331271   6492   0    7520  3217
30    40    648663   301172   347491   330829   6236   0    7610  2816
40    50    649463   302275   347188   330923   6099   0    7371  2795
50    60    649117   312789   336328   318159   11629  0    5938  602
60    70    648603   315006   333597   315355   12319  0    5696  227
70    80    648590   315497   333093   310861   16388  0    5712  132
80    90    648497   314856   333641   310689   17239  0    5639  74
90    100   648923   315669   333254   310061   17379  0    5792  22
-     Sum   5723252  3006738  2716514  2558450  94155  623  52k   ~11k


(n>)   (n<)  Start    Now      T.Fact   10K      2.5T   3T    3T+    5T+

0      1     28187    27609    578      39       17     139   251    132
dat %        100      97.95    2.05     0.14     0.06   0.49  0.89   0.47

1      20    504337   501493   2844     263      357    484   866    874
dat %        100      99.44    0.56     0.05     0.07   0.10  0.17   0.17

0      50    2479522  1432921  1046601  993325   19201  623   23618  9834
dat %        100      57.79    42.21    40.06    0.77   0.03  0.95   0.40

20     50    1946998  903819   1043179  993023   18827  0     22501  8828
dat %        100      46.42    53.58    51.00    0.97   0.00  1.16   0.45

50     100   3243730  1573817  1669913  1565125  74954  0     28777  1057
dat %        100      48.52    51.48    48.25    2.31   0.00  0.89   0.03

0      100   5723252  3006738  2716514  2558450  94155  623   52395  10891
dat %        100      52.54    47.46    44.70    1.65   0.01  0.92   0.19



I can't seem to get the formatting correct on the board

As you can see, the total number of unfactored k/n pairs has been reduced by a little more than 47% for 991<n<100M.

We have reduced the number of k/n pairs between 20M<n<50M by 53.5%, more than half.

Perhaps Joe can comment regarding factors missed by the main effort which would require testing. I know we found >100, but a lot of them were between firstpass and secondpass (<3M); however, more than a handful have been above firstpass.

Does anyone have questions or comments?


P.S. I'd like to personally thank all of you who have submitted your factrange.txt files.

Death
02-18-2005, 08:11 PM
what do you do with these missed factors? submit them at http://seventeenorbust.com/sieve?

how do they score?


you are welcome.

vjs
02-19-2005, 12:50 PM
Death,


How do they score?

As for score... you basically get close to zero, i.e. no points.

a typical factor within the current sieve window gets
~3,000 points for secondpass
and >150,000 points for a firstpass factor

The factors we submit get score = p/1T; it doesn't matter if it's above or below first or secondpass, or even if it eliminates a test from being done. So what this means is we get a score equal to the p in T.

So currently about 20 points for a factor. We are not interested in scoring whatsoever; it's not what our effort is about.
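In other words (a trivial sketch of the scoring rule described above, using the first factor in the list below as the example):

[code]
# Score for a factor submitted outside the current window: just p expressed in T.
def out_of_window_score(p):
    return p / 1e12

print(out_of_window_score(21_236_000_000_000))   # ~21.2 points
# versus ~3,000 for a secondpass factor and >150,000 for a firstpass factor
[/code]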


how do you submit them

Well, not easily. Since the minimum sieve p-value sometimes changes, we are very careful with our submissions and make sure they go through. Oftentimes Joe_O submits more than once, and we have ways of telling if they were applied correctly, such as this page: http://www.seventeenorbust.com/secret. We also submit them in large batches.

Also those factors above 20M are not recorded by the server at all!!!

We are personally collecting them and archiving them in several places, by two or more people, and we will send them to Louie, Kungao, when they decide to start testing/sieving above 20M.

People may not get personal credit for those tests done above 20M currently, but we are recording who did what, which is why people are sieving in blocks of 1T (1000G). It is possible to get personal credit for those below 20M if the user chooses. I and the others personally don't care much about a few hundred points; we get Joe to submit almost everything on our behalf.

Here is a list of our most recent factors; as you can see, a little over 200 factors score less than 2000 points. Notice that these are all factors above doublecheck (1.5M).

[code]

21.236T 4847 6801471 21.236
21.233T 4847 17884527 21.233
21.202T 10223 15831401 21.202
21.192T 55459 2804386 21.192
21.186T 4847 4455927 21.186
21.182T 10223 12081017 21.182
21.181T 55459 2825614 21.181
21.180T 19249 10636106 21.18
21.176T 22699 2264662 21.176
21.169T 24737 180847 21.169
21.169T 24737 2481223 21.169
21.165T 4847 10791207 21.165
21.158T 10223 19529549 21.158
21.147T 10223 16308941 21.147
21.146T 4847 14639871 21.146
21.141T 4847 18128271 21.141
21.133T 4847 7851951 21.133
21.129T 33661 1811736 21.129
21.127T 10223 17530685 21.127
21.118T 10223 9358169 21.118
21.114T 10223 6568781 21.114
21.111T 4847 6420447 21.111
21.110T 4847 19081167 21.11
11.945T 10223 3020357 11.945
11.944T 10223 11330669 11.944
11.416T 24737 3728743 11.416
11.260T 10223 18058757 11.26
10.680T 24737 9804943 10.68
10.519T 33661 14470008 10.519
9.411T 10223 1656749 9.411
9.408T 21181 1627940 9.408
9.385T 55459 2108578 9.385
9.373T 33661 1689792 9.373
9.368T 21181 2696492 9.368
9.365T 4847 1576551 9.365
9.348T 33661 2927040 9.348
9.342T 55459 2990926 9.342
9.328T 55459 1654174 9.328
9.327T 55459 2229430 9.327
9.316T 19249 1930226 9.316
9.311T 19249 1882382 9.311
9.309T 24737 2465191 9.309
9.305T 4847 1723023 9.305
9.294T 55459 1712134 9.294
9.285T 10223 1792745 9.285
9.267T 10223 2377289 9.267
9.264T 10223 2826725 9.264
9.247T 24737 1668703 9.247
9.230T 19249 1765562 9.23
9.224T 19249 1936706 9.224
9.212T 33661 1978488 9.212
9.190T 19249 1968602 9.19
9.183T 4847 2177631 9.183
9.172T 55459 2866906 9.172
9.126T 24737 2271151 9.126
9.120T 22699 2849950 9.12
9.116T 55459 2114038 9.116
9.115T 33661 2377296 9.115
9.104T 10223 2527301 9.104
9.103T 10223 2407085 9.103
9.100T 27653 1853601 9.1
9.087T 19249 2725826 9.087
9.041T 4847 2325351 9.041
9.041T 21181 1759412 9.041
9.040T 21181 2763404 9.04
9.039T 33661 2617656 9.039
9.016T 55459 2903278 9.016
9.015T 27653 2788377 9.015
9.002T 4847 2363391 9.002
9.001T 21181 2006900 9.001
8.630T 22699 17164702 8.63
8.282T 55459 2245114 8.282
8.277T 10223 1804409 8.277
8.264T 55459 2873830 8.264
8.256T 24737 1726831 8.256
8.255T 33661 1994352 8.255
8.250T 19249 2797538 8.25
8.245T 10223 2041769 8.245
8.237T 55459 1529050 8.237
8.229T 33661 2164992 8.229
8.226T 33661 1865256 8.226
8.219T 24737 2468863 8.219
8.208T 24737 2915983 8.208
8.186T 55459 1768774 8.186
8.185T 27653 1690305 8.185
8.174T 27653 1568769 8.174
8.174T 33661 1592808 8.174
8.172T 24737 2538943 8.172
8.170T 33661 1615536 8.17
8.168T 24737 1732663 8.168
8.159T 21181 2513132 8.159
8.149T 33661 1933440 8.149
8.148T 33661 2201184 8.148
8.133T 55459 2919910 8.133
8.132T 4847 1501503 8.132
8.097T 27653 15115209 8.097
8.091T 33661 2227056 8.091
8.085T 10223 2075849 8.085
8.078T 55459 2552098 8.078
8.066T 19249 1760558 8.066
8.063T 67607 2910011 8.063
8.059T 27653 1681593 8.059
8.056T 21181 2559548 8.056
8.043T 24737 1555303 8.043
8.037T 33661 2674224 8.037
8.033T 10223 1913261 8.033
8.020T 4847 1745583 8.02
8.017T 10223 2835881 8.017
8.015T 55459 2843326 8.015
8.012T 10223 2575037 8.012
8.003T 55459 2195506 8.003
8.000T 27653 2389281 8
5.998T 19249 2286878 5.998
5.987T 55459 2182558 5.987
5.976T 21181 2155772 5.976
5.965T 22699 1698742 5.965
5.965T 21181 2665340 5.965
5.953T 24737 2493871 5.953
5.944T 55459 2678518 5.944
5.943T 55459 1943686 5.943
5.941T 55459 1515118 5.941
5.932T 55459 1515106 5.932
5.929T 10223 1526441 5.929
5.927T 55459 2456950 5.927
5.926T 10223 2297957 5.926
5.923T 19249 2916986 5.923
5.919T 55459 2096434 5.919
5.915T 10223 1718969 5.915
5.904T 10223 2071877 5.904
5.900T 21181 2499884 5.9
5.899T 27653 2688153 5.899
5.899T 4847 2966871 5.899
5.896T 55459 2123758 5.896
5.888T 55459 2192986 5.888
5.877T 27653 2582241 5.877
5.871T 27653 1995729 5.871
5.869T 67607 2187251 5.869
5.869T 24737 1892671 5.869
5.853T 24737 1569631 5.853
5.848T 21181 2328980 5.848
5.833T 27653 1747617 5.833
5.831T 33661 2213448 5.831
5.816T 55459 1814806 5.816
5.813T 10223 1525001 5.813
5.810T 55459 2661958 5.81
5.805T 33661 2346936 5.805
5.799T 4847 2580111 5.799
5.793T 4847 2366511 5.793
5.778T 21181 1825820 5.778
5.772T 21181 2705324 5.772
5.763T 33661 2400648 5.763
5.757T 21181 2864852 5.757
5.751T 19249 2048702 5.751
5.749T 4847 2093583 5.749
5.744T 55459 1547218 5.744
5.734T 67607 2067507 5.734
5.724T 67607 2937371 5.724
5.721T 55459 2548750 5.721
5.716T 55459 2984194 5.716
5.706T 55459 2716426 5.706
5.694T 24737 1721503 5.694
5.689T 27653 2166297 5.689
5.685T 19249 14806706 5.685
5.684T 24737 1982191 5.684
5.669T 10223 1502297 5.669
5.669T 4847 2241231 5.669
5.668T 21181 1761668 5.668
5.667T 4847 2254071 5.667
5.662T 21181 1531844 5.662
5.658T 21181 2658068 5.658
5.657T 10223 2863661 5.657
5.652T 24737 1698103 5.652
5.651T 24737 1888831 5.651
5.643T 10223 2915561 5.643
5.638T 22699 1931662 5.638
5.637T 33661 1538784 5.637
5.637T 10223 2754569 5.637
5.633T 21181 2766164 5.633
5.623T 21181 1767860 5.623
5.302T 24737 2806543 5.302
4.997T 4847 2758551 4.997
4.982T 33661 1910160 4.982
4.972T 10223 2897897 4.972
4.968T 4847 1700991 4.968
4.967T 33661 2615256 4.967
4.966T 10223 2895077 4.966
4.960T 4847 2406663 4.96
4.959T 55459 1854118 4.959
4.929T 10223 1811465 4.929
4.916T 27653 2014377 4.916
4.914T 4847 2639703 4.914
4.902T 4847 2291751 4.902
4.899T 22699 1834390 4.899
4.896T 24737 1890991 4.896
4.894T 33661 2638392 4.894
4.892T 24737 2372263 4.892
4.888T 27653 2529321 4.888
4.886T 24737 2647303 4.886
4.883T 33661 2398200 4.883
4.879T 33661 1987560 4.879
4.875T 10223 2980841 4.875
4.856T 24737 1610623 4.856
4.854T 55459 1668178 4.854
4.846T 67607 2916971 4.846
4.833T 33661 2345040 4.833
4.831T 21181 2699924 4.831
4.825T 24737 3685063 4.825
4.819T 19249 1799762 4.819
4.818T 55459 2578654 4.818
4.812T 10223 2831021 4.812
4.812T 27653 2383089 4.812
4.805T 21181 2622644 4.805
4.804T 19249 1512842 4.804
4.804T 4847 2567751 4.804
4.795T 33661 2687616 4.795
4.788T 4847 1646511 4.788
4.775T 10223 2578505 4.775
4.771T 24737 1718071 4.771
4.769T 55459 2709874 4.769
4.766T 4847 2131983 4.766
4.759T 22699 2421118 4.759
4.756T 55459 2751070 4.756
4.755T 10223 2458217 4.755
4.747T 24737 2677663 4.747
4.739T 21181 1637060 4.739
4.737T 67607 2019491 4.737
4.731T 33661 1666248 4.731
4.727T 24737 2327071 4.727
4.723T 67607 1941611 4.723
4.718T 21181 2128484 4.718
[/code]

Joe O
02-22-2005, 05:20 PM
I also would like to thank those who have submitted their factrange.txt files.

Joe O
02-27-2005, 07:22 PM
Just to give you an idea of the k n pairs we have eliminated so far:

Joe O
03-10-2005, 03:33 PM
Time for an update.
Can you spot the differences?

vjs
03-10-2005, 06:05 PM
Some stats from Joe_O's latest dat run...

vjs
03-10-2005, 06:09 PM
Also a graphical representation of the removed k/n pairs...

We now have less than 3M pairs remaining in the dat, WOW quite a few but almost half of what we started with.

I'd also like to point out Joe_O's above graph represents missed factors by the main effort... there are quite a few.

ShoeLace
03-10-2005, 07:09 PM
excuse me if i'm blind, but where can i get/download this 0-50M dat file?

Joe O
03-10-2005, 08:40 PM
You can find the 991-50M dat in the file section of this group:

Sierpinski Sieve (http://groups.yahoo.com/group/SierpinskiSieve/)

If anyone wishes to join this group they are very, very welcome.

You can send an email to factrange at yahoo dot com and I will send you the yahoo invitation to make it easier to join this group.

Use this dat for ranges reserved in the ordinary way. Submit fact.txt in the usual way and then send it and factrange.txt to factrange at yahoo dot com so that we can process the factors over 20M.

Enjoy!


vaughan
03-11-2005, 01:18 AM
Joe O,
When you try to join this Yahoo Group you are presented with a security word verification thingy to type in. It is very hard to decipher. It took me 3 attempts until I managed to work it out, as there are stray bits of serifs etc. that mess up the word. Also, since when does a word contain a numerical character? Perhaps it should be rephrased as "enter security code" instead of "word verification".

Joe O
03-11-2005, 06:38 AM
Originally posted by vaughan
Joe O,
When you try to join this Yahoo Group you are presented with a security word verification thingy to type in. It is very hard to decipher. It took me 3 attempts until I managed to work it out, as there are stray bits of serifs etc. that mess up the word. Also, since when does a word contain a numerical character? Perhaps it should be rephrased as "enter security code" instead of "word verification".

I have no control over that. That is Yahoo's attempt to keep out 'bots'. That is why I also offered to send an invitation that bypasses that to anyone who emails me.

vjs
03-11-2005, 11:06 AM
I'd also like to point out a couple things:

We would like everyone to reserve in increments of 1T (1000G, 2000G, 3000G, etc.); this helps us in several regards. Also, if you wish to reserve a large chunk, please try to finish it in less than 2 months.

If you wish to simply try out the new dat on a 50G range etc., you're more than welcome to do so. However, please reserve main-effort ranges through the normal channels, using the 991<n<50M dat instead.

The 991<n<50M dat will find all of the factors the current 1.6M<n<20M dat normally would; however, it is about 15% slower. It also finds factors above 20M and below the minimum threshold.

We have been finding many missed factors within the 1M<n<20M range, based on the lower p-values we have already searched.
As one can see from Joe's post, these efforts are far from useless.

Thanks and I welcome all the new faces, please post your comments/experiences here, good luck and happy sieving.

vjs
03-15-2005, 12:03 PM
Anyone else going to take us up on this offer? We basically have everything less than 40T assigned now.

Stromkarl
03-16-2005, 07:41 AM
I would like to, but I am right in the middle of a range that won't be done for 30 days or so. Also, the company I work for is going to be phasing out all of the machines I am sieving with soon! They will be replaced with P4s with Win XP Pro. I know that sieving with a P4 used to be a waste of time. Is it still that way? If the SSE2 sieving client fixes that problem, I may continue to use 2-3 of these machines for sieving.

So, I will probably finish up the ranges I currently have reserved, then go either into PRP or P1 factoring. I can wrap up the uncompleted reserved ranges with one of the machines I have at home.

Stromkarl

Joe O
03-16-2005, 08:22 AM
Originally posted by Stromkarl
If the SSE2 sieving client fixes that problem,

I don't know if proth_sieve_sse2 "fixes" the problem, but it is faster than proth_sieve_cmov on two SSE2 capable machines that I have tried it on.

While the choice of what you run remains entirely up to you, I would like to urge you to strongly consider P-1 factoring, if sieving is inefficient on the machines available to you. Prime95 version 24.6 continues a long tradition and does a very good job of P-1 factoring of k*2^n+1.

The worktodo.ini entry
Pfactor=4847,2,2005647,1,43,3

gives us in results.txt
[Sun Oct 17 12:08:06 2004]
4847*2^2005647+1 completed P-1, B1=35000, B2=455000, WZ1: 33F53785

By the way, I would recommend changing the name results.txt to factors.txt or some other name to prevent conflict with SB's results.txt file. You do this by putting a line in prime.ini:
results.txt=your_filename
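For anyone setting several of these up at once, here is a minimal sketch that writes worktodo.ini lines in the same format as Joe's example. The last two fields (43 and 3) are simply copied from that example; their exact meaning (presumably the trial-factoring depth in bits and the number of tests a factor saves) is an assumption here, and the second k/n pair is made up purely for illustration.

[code]
# Append Pfactor lines (format from Joe's example: Pfactor=k,2,n,1,43,3)
# to Prime95's worktodo.ini.
candidates = [(4847, 2005647), (10223, 2005673)]   # (k, n); second pair is hypothetical

with open("worktodo.ini", "a") as f:
    for k, n in candidates:
        f.write(f"Pfactor={k},2,{n},1,43,3\n")
[/code]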

Mystwalker
03-16-2005, 11:00 AM
Originally posted by Joe O
I don't know if proth_sieve_sse2 "fixes" the problem, but it is faster than proth_sieve_cmov on two SSE2 capable machines that I have tried it on.

At least on Intel (I have no AMDs to try it out, but am pretty sure it holds here as well) SSE2-enabled CPUs, the SSE2 version is ~10% faster than the CMOV one.
When the FSB is clocked at 800 MHz, sieving speed is "not bad" - but not good either. :(

I'd also suggest P-1 factoring or normal PRPing. The P4s fly in these fields. :)

Silverfish
03-16-2005, 12:16 PM
I'm thinking about getting involved in the 0-50M sieving. I finish a range in the next couple of weeks. I have a few questions to ask.

1. I've estimated that I'll be taking about 2 months to do a 1T range, based on my current sieving rate, and reducing the speed by 15%. Would this be a problem?

2. I gather this sort of sieving finds some factors missed by the 0-20M range sieving, but how many are found? In particular, how does this compare with the number found through sieving 0-20M currently happening?

3. Would I just use prothsieve as usual, but with the different dat file?

4. How would I submit factors? Would I just e-mail you all the factors?

5. Do the range reservations happen on the yahoo site you mentioned?

Thanks in advance.

vjs
03-16-2005, 12:37 PM
I've tried sse2 on a P4; it is better than it was before... when you compare both sse2 and cmov on an athlon64, the speeds are almost identical.

The problem is not the client it's the processor P4's just don't sieve well it has something to do with the pipeline? Regardless, it's not a waste to sieve with a P4 compared to doing nothing but for SoB your really better off putting those into prp or factoring.

If it were me I'd basically put all of those machines on secondpass prp. The only two machines I don't have running sieve are on the garbage account. For work you might be best off simply starting sob as a service and forgetting about it. Factoring works and is probably the most benfital but it's manual like sieve.

vjs
03-16-2005, 01:08 PM
1. I've estimated that I'll be taking about 2 months to do a 1T range, based on my current sieving rate, and reducing the speed by 15%. Would this be a problem?

Yes, the 15% estimate is still accurate, and no, 2 months is not really a problem at all. The major point is that we are trying to progress quickly through these low T.
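
As a rough sanity check on that estimate, here is the arithmetic in a few lines of Python; the 230 kp/s base rate is just a placeholder I made up, so plug in your own proth_sieve rate.

# Back-of-the-envelope check of the "about 2 months per 1T" estimate.
base_rate_kps = 230                 # kp/s with the regular dat (placeholder)
rate_kps = base_rate_kps * 0.85     # roughly 15% slower with the 991<n<50M dat
range_p = 1_000_000_000_000         # a 1T range of p

seconds = range_p / (rate_kps * 1000)
print(f"{seconds / 86400:.0f} days")    # ~59 days, i.e. roughly two months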


2. I gather this sort of sieving finds some factors missed by the 0-20M range sieving, but how many are found? In particular, how does this compare with the number found through sieving 0-20M currently happening?

Well, this really depends on how many were missed. Currently we are very sure we are finding all factors that were missed, because we have identified the reasons why they were missed in the first place.

As for the number of factors found, this depends on the range. Sometimes you get lucky... I haven't done the calculations recently, but there have been ranges that produced more factors per G than first-pass sieve. (Perhaps Joe can give you exact numbers.)

The other option of course is to sieve a high n range with the 991<n<50M dat. This would be beneficial to both projects; basically you're getting the high and low n work for a 15% overhead...

3. Would I just use prothsieve as usual, but with the different dat file?

Everything is the same except where you get the dat; the dat is downloadable through the group. I'll try posting it in this message but I think it's too large.


4. How would I submit factors? Would I just e-mail you all the factors?

Factors for n<20M are accepted through the website as usual. Once the range is finished, zip fact.txt and factrange.txt and e-mail them to factrange@yahoo.com

5. Do the range reservations happen on the yahoo site you mentioned?

Yes they happen on the group, but I could assign you a 1T range if you wish.

vjs
03-16-2005, 01:10 PM
You could also get the dat from keroberts' site; it's an open site that doesn't require an invitation.

SoB.dat_991-50M_20050310.zip


http://groups.yahoo.com/group/kerobertsdatfiles

Joe O
03-16-2005, 01:12 PM
Originally posted by Silverfish
I'm thinking about getting involved in the 0-50M sieving. I finish a range in the next couple of weeks. I have a few questions to ask.

1. I've estimated that I'll be taking about 2 months to do a 1T range, based on my current sieving rate, and reducing the speed by 15%. Would this be a problem?

2. I gather this sort of sieving finds some factors missed by the 0-20M range sieving, but how many are found? In particular, how does this compare with the number found through sieving 0-20M currently happening?

3. Would I just use prothsieve as usual, but with the different dat file?

4. How would I submit factors? Would I just e-mail you all the factors?

5. Do the range reservations happen on the yahoo site you mentioned?

Thanks in advance.

1) 2 months is not too bad. Submit every 250G or at 500G so we can monitor your progress.

2) How many are found depends on the range. Take a look at my graph a few posts up in this thread (on the previous page).

3) Yes you would use the same prothsieve, but with the 991-50M dat. See my post above for joining the Yahoo Group where we make it available.

4) Since your range would be above the 25T mark, you would submit your fact.txt in the normal way AND then email it with your factrange.txt to factrange at yahoo dot com.

5) Yes the range reservations for our high N/resieve effort happen on the yahoo site.
Having said that, there is another way that you could participate in our effort, especially if you are leery about the 1T restriction we need to maintain, and/or would like to help us but want to help the primary sieve more directly: just reserve a normal range in the normal way, and use the 991-50M dat instead of the smaller dat.
Then you would be doing a normal range, just 15% slower, and you would be helping out the high n effort as well. When you are done, you would submit your fact.txt file in the normal way and then email it to us along with your factrange.txt file. There is no need for a second reservation; I would take care of that when your files arrive.

Edit: Looks like VJS types faster, or started before I did!<Grin>

vjs
03-16-2005, 03:42 PM
I know this is in no way representative, but here is an example of what I got with one machine in a 13250-13330 80G range.


13284778877917 | 33661*2^2607408+1
13295346336989 | 33661*2^2197440+1
13305730970641 | 10223*2^1859897+1
13306082217083 | 55459*2^2505010+1

This range basically represents 2 days of sieving, so was sieving this range fruitful from an SoB standpoint?

Well, considering these values would probably only have to be checked one more time, my machine could have probably tested all four of them in the same time period, so in that respect I broke even with the 1<n<20M range.

However, when you consider all of the factors, I also found these... which would probably take me more than a year to test just once.



13251650260463 | 24737*2^27377551+1
13252049453837 | 27653*2^40721649+1
13252192917967 | 22699*2^24033070+1
13252910592211 | 4847*2^48794703+1
13254103813583 | 55459*2^44814838+1
13254486856979 | 33661*2^38090520+1
13254640159549 | 24737*2^28527247+1
13254820268743 | 55459*2^46572358+1
13255917395797 | 55459*2^46087678+1
13256116260889 | 55459*2^30425638+1
13256338185757 | 55459*2^36592906+1
13256844520849 | 33661*2^23860440+1
13257146221289 | 24737*2^30642271+1
13257231037247 | 21181*2^46307924+1
13258218262921 | 24737*2^25645327+1
13258949194753 | 10223*2^27549797+1
13259184690457 | 10223*2^34781309+1
13259695241417 | 10223*2^35824121+1
13259755514879 | 24737*2^23174623+1
13260444295883 | 24737*2^32306191+1
13260812435371 | 4847*2^35751951+1
13261877629229 | 24737*2^39750967+1
13262214929861 | 67607*2^41595251+1
13262677742837 | 55459*2^21335350+1
13263337295009 | 55459*2^34524526+1
13265189936983 | 55459*2^1054714+1
13265562421411 | 10223*2^49556381+1
13266086401091 | 33661*2^21343488+1
13266575081623 | 55459*2^49117306+1
13266748140143 | 33661*2^32073840+1
13267229836337 | 24737*2^22370527+1
13267272164687 | 22699*2^43698574+1
13267319079667 | 21181*2^43755644+1
13267492354693 | 55459*2^38286514+1
13269581418011 | 10223*2^29824421+1
13269716543203 | 55459*2^29377606+1
13269794672209 | 27653*2^44238669+1
13270312620319 | 55459*2^20144794+1
13270500700271 | 4847*2^48313167+1
13271596378907 | 4847*2^42220167+1
13272498734743 | 33661*2^34197120+1
13272514427747 | 55459*2^37885978+1
13273001862139 | 10223*2^47340317+1
13273710138943 | 10223*2^42931661+1
13273975237591 | 33661*2^27752688+1
13274339792837 | 4847*2^37747623+1
13274872765621 | 33661*2^22979736+1
13275059055857 | 10223*2^34075997+1
13275372902839 | 21181*2^22938548+1
13275724306337 | 55459*2^36805294+1
13278093708541 | 55459*2^43215778+1
13278424574677 | 10223*2^32145545+1
13278487071287 | 67607*2^33016451+1
13279838572487 | 55459*2^21893014+1
13280467785643 | 21181*2^40687004+1
13281094530307 | 55459*2^29702254+1
13281127943939 | 4847*2^38951823+1
13281172475171 | 27653*2^37507425+1
13281370012753 | 4847*2^33987591+1
13282464438037 | 10223*2^31969865+1
13282467971803 | 19249*2^26276378+1
13282534439567 | 19249*2^44383298+1
13282757645137 | 4847*2^46368327+1
13283053238129 | 21181*2^20682620+1
13283949991267 | 10223*2^48927785+1
13284073745443 | 10223*2^35746361+1
13284369250099 | 55459*2^34514554+1
13284504944849 | 24737*2^41493487+1
13284778877917 | 33661*2^2607408+1
13285767768001 | 24737*2^24854431+1
13285884303253 | 4847*2^28026351+1
13286169911687 | 24737*2^35010511+1
13286226204299 | 4847*2^30993471+1
13286458651909 | 55459*2^29674498+1
13286686151009 | 10223*2^42743789+1
13287747444859 | 21181*2^45831284+1
13289107917067 | 55459*2^32720086+1
13289356654967 | 24737*2^33786343+1
13289466209711 | 55459*2^40597606+1
13290105107599 | 10223*2^39877709+1
13290158833339 | 21181*2^35663468+1
13291254357341 | 4847*2^31783383+1
13291954709849 | 10223*2^30115517+1
13292661257129 | 10223*2^39346109+1
13292932922449 | 27653*2^25473777+1
13292933927561 | 10223*2^34087289+1
13293193263517 | 55459*2^49828378+1
13293262858231 | 22699*2^47267542+1
13293389624267 | 10223*2^44077445+1
13293593530237 | 24737*2^35308063+1
13293778047619 | 22699*2^30936070+1
13293958141447 | 10223*2^24820169+1
13294481796901 | 55459*2^47139106+1
13294499440373 | 55459*2^20878606+1
13294500293551 | 22699*2^45024958+1
13294670516821 | 10223*2^34634585+1
13295221098721 | 22699*2^98470+1
13295346336989 | 33661*2^2197440+1
13296135964913 | 33661*2^38509656+1
13296367169983 | 33661*2^46257936+1
13296588903139 | 22699*2^34197742+1
13297200998443 | 21181*2^30343892+1
13297904730013 | 24737*2^24845407+1
13298706293273 | 21181*2^33386204+1
13299400025057 | 67607*2^20991771+1
13299466163717 | 55459*2^34946878+1
13301959723243 | 27653*2^29625513+1
13301971166903 | 22699*2^27482518+1
13302140508539 | 55459*2^47281486+1
13302825011669 | 24737*2^39045343+1
13304533857397 | 24737*2^28967527+1
13305076882171 | 27653*2^37030569+1
13305081319171 | 19249*2^32612162+1
13305295082141 | 67607*2^49732091+1
13305730970641 | 10223*2^1859897+1
13306082217083 | 55459*2^2505010+1
13306502645941 | 10223*2^37651277+1
13306753652971 | 27653*2^29471649+1
13307680330249 | 55459*2^47613118+1
13307689376117 | 4847*2^31530543+1
13307898631487 | 19249*2^23148122+1
13308036340873 | 21181*2^24968852+1
13308421174879 | 27653*2^40506189+1
13308904690513 | 10223*2^37956281+1
13308991182427 | 55459*2^21469546+1
13309061485931 | 67607*2^27108387+1
13309382790929 | 33661*2^39151872+1
13310281410047 | 33661*2^24432408+1
13310843481167 | 55459*2^32954278+1
13311536894339 | 24737*2^39499231+1
13312083453727 | 4847*2^36056751+1
13312266173749 | 55459*2^30448390+1
13312292049223 | 4847*2^26065671+1
13312352531371 | 10223*2^33099737+1
13312980347489 | 24737*2^38186431+1
13313134894399 | 27653*2^613869+1
13313375947273 | 10223*2^20471549+1
13313688975997 | 24737*2^30883543+1
13314038587049 | 10223*2^35966681+1
13314978630451 | 10223*2^35744165+1
13315032984809 | 21181*2^41026148+1
13315342398703 | 24737*2^45671743+1
13315664764627 | 10223*2^26149181+1
13315754582243 | 21181*2^29375108+1
13316145371041 | 33661*2^44607384+1
13316244055313 | 21181*2^30275180+1
13316667483541 | 33661*2^43140384+1
13316704230413 | 55459*2^21283618+1
13316817011647 | 24737*2^24126631+1
13317258731341 | 10223*2^48690029+1
13317408590209 | 4847*2^26414871+1
13317653507083 | 24737*2^48500071+1
13317783761813 | 55459*2^20718778+1
13318342730111 | 10223*2^44016221+1
13318878059203 | 4847*2^45526863+1
13318908827327 | 10223*2^41948645+1
13319501791147 | 55459*2^26656354+1
13320194707247 | 10223*2^40738085+1
13320325101991 | 10223*2^28758077+1
13320544670393 | 22699*2^38291734+1
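
Since the lines above are in the standard proth_sieve output format (p | k*2^n+1), here is a small Python sketch, purely my own illustration, that splits a fact.txt-style list at the 20M boundary being discussed; the two sample lines are copied from the list above.

# Minimal sketch: split proth_sieve-style factor lines ("p | k*2^n+1")
# at the n=20M boundary discussed in this thread.
import re

line_re = re.compile(r"^(\d+)\s*\|\s*(\d+)\*2\^(\d+)\+1$")

def split_by_n(lines, boundary=20_000_000):
    low, high = [], []
    for line in lines:
        m = line_re.match(line.strip())
        if not m:
            continue                          # skip blanks or malformed lines
        p, k, n = (int(g) for g in m.groups())
        (low if n < boundary else high).append((p, k, n))
    return low, high

sample = ["13284778877917 | 33661*2^2607408+1",
          "13251650260463 | 24737*2^27377551+1"]
low, high = split_by_n(sample)
print(len(low), "factors with n<20M,", len(high), "with n>=20M")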

Silverfish
03-18-2005, 12:09 PM
I think I'll probably do a normal sieving range next, but with the 0-50M dat file. Thanks for the info vjs and Joe O.

vjs
03-21-2005, 06:14 PM
You can switch to the large dat at any point in time.

Joe and I are pretty good at determining which dat was used and where... of course you could make it easy for us and tell us exactly where you left off or started with each dat. :-) (round numbers are not required for main ranges anyways.)


What will probably end up happening eventually is everyone switching to a higher n dat. At that point Joe and I will do a little secretarial work and post the ranges completed on the new reservation thread.

Silverfish
03-22-2005, 12:31 PM
I've just started on the 0-50M dat, for the last 80G or so of my current range. I made a note of the last pmin figure in the SobStatus dat file before I changed to the 0-50M dat, and I'll send you that with the factors from the range when I finish it.

Another question though: are you interested in the factors in factexcl at all? I'll be submitting the 0-20M factors in the normal way, but what about those >20M? I suspect there won't be very many, but I just thought I should check. I'll probably keep them anyway, so I can analyze the stats later on.

vjs
03-22-2005, 03:17 PM
Basically don't worry about factexcl.txt; it doesn't contain any new factors eliminating k/n pairs from the dat.

Joe and I have been keeping factexcl.txt thus far because it may contain some useful information for factoring purposes later, but not likely. In any case this file won't help SoB finish any faster.

What's most important is that you submit fact.txt and factrange.txt; all of fact.txt is useful and a little more than 10% of factrange is also useful.

Keroberts1
03-22-2005, 10:31 PM
Perhaps the 0-50M dat should be posted on Mike's site for those interested but without access to the other sites.

vjs
03-23-2005, 04:42 PM
This is probably a good idea but we would have to convince Mike. The file is ~2.5x larger than the regular dat; not sure if bandwidth is an issue for him.

In the meantime if anyone wants the dat and doesn't want to jump through yahoo hoops and loops just PM me with your e-mail.

Joe O
03-24-2005, 12:02 PM
Help us to make it smaller, by joining our effort. In the meantime, this is the best that I can do. (http://groups.yahoo.com/group/kerobertsdatfiles/files/)

vjs
03-28-2005, 02:00 PM
I'd like to remind everyone about submitting factors to factrange@yahoo.com

For those of you using the 991<n<50M dat, the server will correctly record and assign credit for those factors with n<20M; however, those above 20M are simply ignored. Currently no factor with n>20M is recorded by the server.

In order for these to be recorded please submit them to factrange@yahoo.com

This method is working well, since we don't require those factors immediately. Once you have entirely completed your range, please mail the fact.txt and factrange.txt files to factrange@yahoo.com.

The preferred format is as follows:

Both files zipped together with the following name.

Sieve range 991-50M username.zip

Example,

810000-815000 991-50M VJS.zip

So far this has not been a problem, but we are getting quite a few new users with the 991-50M dat and I just wanted to mention it again.
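
For what it's worth, here is a small Python sketch of that packaging step; the file names and username are only illustrative, and the naming convention is the one described above.

# Sketch: zip a finished range's fact.txt and factrange.txt using the
# "<range> 991-50M <username>.zip" naming convention described above.
import zipfile

def package_range(p_low_g, p_high_g, username):
    name = f"{p_low_g}-{p_high_g} 991-50M {username}.zip"
    with zipfile.ZipFile(name, "w", zipfile.ZIP_DEFLATED) as z:
        z.write("fact.txt")
        z.write("factrange.txt")
    return name

print(package_range(810000, 815000, "VJS"))   # "810000-815000 991-50M VJS.zip"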

Of course we are still accepting factrange files as well for those of you processing with the 1.7M<n<20M dat. For these submissions please include "factrange" in the subject line.


On a sidenote,

You are encouraged to submit fact.txt to the server first or as you progress through your range. However intermediate submissions to factrange@yahoo.com are not required.

Thanks to all of those who are participating in the main effort with this dat. I'd also like to note that we are still finding some missed factors with n<20M through the low-p double-check resieving with the 991<n<50M dat. From a sieve standpoint, double-check sieve is still well worth the effort. We will inform users if this situation changes.

Expect to see more low-p missed factors appearing on Mike's pages in the near future.

A note on scoring...

Finally, we are assigning ranges above 40T in double-check sieve. These missed factors will now score upwards of 10K points. This is significantly better than current second-pass factor scores and especially better than the scores for those factors with p<40T.

ShoeLace
03-30-2005, 07:50 PM
forgive me if this is a repeat but a general question for discussion...

since it appears that there is a growing number of seivers using a high n dat file (ie 1k-50M)...

would it not be more sensical to enhance the existing factor submission page (or create a new high-n page) to submit these factors too?

I know this would involve some work on behalf of the db/project admins, who are probably quite busy.

A few points/queries about this suggestion.

*) factors are stored/kept/received by those who ultimately want/need them.

*) I am NOT suggesting the automation of the sieving, just accepting factors n>20M

*) if the current data structure does not support n of this size, then it may be even more effort. It would also involve greater storage space.

*) I do not know the method of creating/reducing the dat file by factors found, nor whether this feature would aid in that.

*) would this entail users receiving points on mikeH's pages?

I will now invite discussion/comment.

Shoe Lace


PS. sensical (adjective): sensible and/or practical

Joe O
03-31-2005, 06:48 AM
Originally posted by vjs

Thanks to all of those who are participating in the main effort with this dat. I'd also like to note that we are still finding some missed factors with n<20M through the low-p double-check resieving with the 991<n<50M dat. From a sieve standpoint, double-check sieve is still well worth the effort. We will inform users if this situation changes.


This is a graph of the most recently found n<20M factors from the low-p double check.

vjs
04-01-2005, 12:41 PM
ShoeLace,

Your comments basically describe the ultimate plan and I believe we are close to that point. However it will be the decision of Louie as to the acceptance of these factors...

Let me comment on some of your points as a whole:

First, independent of the sieve range the user decides to sieve with the 991<n<50M dat, be it p=40T or p=800T (current sieve ranges), Mike's pages and the server will accept and score all factors with p>25T (which we are now above with second pass) and n<20M.

This is good since we are still quite a ways away from testing n>20M, but those found with n<20M are automatically applied and scored.


You commented on server space, who gets the factors, etc... add to this factor verification, testing for gaps in sieved ranges, co-ordination, and Louie and others being busy updating the dat, etc.

This can be summed up in the efforts which have been done so far.


First, a tremendous number of factors have been found thus far for p<40T; as a matter of fact we have collected over 500MB of factor files. You can see how this sheer number of factors would have caused a problem for the server, automation, and updating of the dat in the past.

Joe has done a fantastic job updating the 991<n<50M dat we are currently using. I believe he said at one point it took his machine 4 hours to process one large factor submission of mine. You can see that such a submission to the server and 100% load for 4 hours would be a problem.

This is part of the reason Louie doesn't accept factors p<25T or n>20M and why they are not reported.

Another thing to consider is that the number of factors found per G sieved decreases with increasing p. Hence, now that we are above 25T the number of new factors found drops off, and so does the load on the server and the importance of updating the dat. (The 991<n<50M dat has decreased in size from >27MB to <8MB.)

An 8MB dat isn't really an issue anymore; it consumes <32MB of machine memory and its size is not that difficult to work with or transfer.

This was the point we were trying to get to (25T). Perhaps it is still too soon, but trying to do this effort through the server for p<25T would have been a nightmare.

Regardless, now may be the time for the project to consider accepting factors with n>20M, or perhaps we should continue as is until 75T... it's Louie's decision as to if and when.

Currently there isn't much of a problem continuing as is: sieve and submit factors as usual, then once the range is entirely finished send the factors in via e-mail. We can then check for gaps etc. above and below 20M; it avoids ranges being divided up into 20G chunks and makes it very easy for archiving purposes.

As for points scored etc., even if those factors scored at present, the points awarded would be very low until the factors enter the main window, so for scoring it's not important. In the past the users who were participating were not as interested in scoring personally; it was more of a project goal. Stats regarding factors found and progress etc. are reported in the yahoo group and here from time to time, they are just not updated every 4 hours like Mike's.

As for updating the dat....

Joe runs his own db server that handles the dat and applies factors; currently it's working just fine. From time to time he grabs what was sent to the server through the webpage or e-mailed to factrange and processes all the factors. Joe would have to comment on whether it would be easier to get everything from the server, I'm not sure.

For stats updates, I'll probably put together another stats update next week. Also eventually we will produce stats on a user basis which is why we like people to reserve in 1T increments for double-check-sieve.

More on the 991<n<50M sieve stats later...

Keroberts1
04-01-2005, 05:30 PM
I've been trying to do some DC sieving with the large dat but I haven't been able to get to the .dat file. I have problems with yahoo and I would like it if there was a link in the forum. Could someone help me with that?

vjs
04-01-2005, 10:22 PM
I'll e-mail you the dat to your aol account tomorrow, let me know if you get it.

Joe O
04-04-2005, 04:41 PM
Since this is the thread for n > 20M sieving, I thought that you might want to see the graph of our progress.

vjs
04-08-2005, 02:27 PM
For those of you following and contributing to the high-n sieve, the double-check sieve, and sending factrange.txt to factrange@yahoo.com, here is the final graph of this type.

This graph represents the decrease in dat size, or number of k/n pairs remaining, vs. update.

This final point (9) in this graph represents the most recent reduction which includes...
- all factrange@yahoo.com submissions
- all ranges p<25T have been completely sieved with a 991<n<50M dat
- all p<3T with a 50M<n<100M dat
- those factors submitted p>25T with the 991<n<50M
- the low-n P-1 factoring effort
- all factors found thus far by the main effort for n<20M

Whew!!! Quite a few factors and sources...

We have reduced the total number of k/n pairs between 20M<n<50M from just shy of 2M pairs (1,946,998) by over 55% to 871,348 k/n pairs remaining.

This leaves roughly
29,045 k/n pairs per 1M range of n.

(For comparison purposes) 1M-20M has roughly
26,264 k/n pairs per 1M range of n.

Continued in next message, see graph.

Y-axis represents total number of k/n's remaining without factors
X-axis represents the update interval.
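
Just to double-check the figures quoted above, here is a quick Python sketch; the numbers are copied straight from this post.

before, after = 1_946_998, 871_348                       # 20M<n<50M pairs, before and now
print(f"reduction: {100 * (1 - after / before):.1f}%")   # ~55.2%, i.e. "over 55%"
print(f"pairs per 1M of n: {after / 30:.0f}")            # ~29045 for 20M-50M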

vjs
04-08-2005, 02:37 PM
For comparison purposes in the future... the following graph represents the number of k/n pairs (and their reduction) for 20M<n<50M only.

Note that there is one less update; update #8 in this graph is the final update (same as the final one above) and will serve as the start of all new graphs.

This point can be loosely referred to as the p=25T sieve point in the future.

Note:
Y-axis represents number of k/n pairs remaining
X-axis represents update interval

(Update 2-3 in previous graph does not apply to this graph)

vjs
04-08-2005, 03:31 PM
Final table representing previous update intervals.

Future tables may have different limits on lower n> / Upper n< to represent current prp levels of first-pass and second-pass.

All future tables will begin with the April-5 (p=25T) update interval. (Original will change to p=25T)

Notes:

The dat size in bold (Lower LHS of table) represents the current 20M<n<50M range and 991<n<50M dat.

Points of interest for the most recent update shown in bold and noted as "found April-5"...

Notice the relatively large number of n<1M pairs eliminated; many of these were from the low-p factoring efforts. The 20M<n<30M range's "factors found" is slightly greater than the 30M<n<40M or 40M<n<50M ranges; this difference, roughly 60 factors, is probably due to factrange@yahoo.com submissions.

Note the reduction in the 10M ranges 50-60, 60-70, 80-90, 90-100... These unique factors were found solely through factrange.txt submissions from people using the 991<n<50M dat. It seems as though all dats are able to find a great deal of unique factors just above their upper limit.

The 1.7M<n<20M main-effort dat finds factors into the low 40M range, whereas the 991<n<50M dat finds a lot of factors from 50M-100M and above... yes, we are keeping n>100M as well.

vjs
04-08-2005, 03:50 PM
I would also like to make one final comment...

Joe has produced a new dat (drum-roll) .... the dat is now less than 8MB uncompressed!!!!

:elephant: :elephant: :drums: :bouncy: :cheers:

Memory consumption for this dat is now around 31MB.

The new dat is available through the high-n sieve group; perhaps if we ask Mike nicely he will host this dat as well... (~1.8MB compressed)

Mystwalker
04-19-2005, 08:26 AM
As all n <= 25,000 have been tested (http://www.free-dc.org/forum/showthread.php?s=&threadid=8571&perpage=35&pagenumber=2) for at least 20 digits, maybe a new sieve file should start from 25,000 and not from 991?

The chances that a factor for those tests is found are next to zero...
The chances that a factor for that tests is found is next to zero...

vjs
04-21-2005, 11:31 AM
Mystwalker,

Thanks for pointing this out... there are two things to consider here.

First how much will this speed up the client?

Bringing the lowest n up from 991 to 25,000 will speed up the client, but not by much.

speed increase = [100 x (25000-991)]/50000000 = 0.05%
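
The same back-of-the-envelope calculation in a line of Python, using the figures above:

print(100 * (25_000 - 991) / 50_000_000)   # 0.048, i.e. roughly a 0.05% speed-up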


Second, the dat, in addition to being the parameter file to run sieve, is also an archive of all unfactored k/n pairs with n<50M. So if one decides to factor these small numbers, they always know which k/n pairs are left by downloading the latest dat.

So I don't think the speed increase is worth the sacrifice of maintaining two files, dats, or a separate k/n archive, etc.

Your thoughts?

Mystwalker
04-21-2005, 12:46 PM
I think it's just a design decision without "right/good" or "wrong/bad".

There definitely is next to no performance increase. And the use of the sieve.dat as a factorization repository has its merit, no doubt.
It's just my addiction to optimization, I guess. ;)

vjs
04-21-2005, 12:59 PM
addiction to optimization

I'm with you on that one, eventually we will change the 991<n<50M dat to a different size, that's for sure. It depends on a lot of factors :rotfl: (wow sometimes I kill myself); how many missed factors we find through the resieve, whether we eliminate the 1.8M<n<20M dat and go strictly to n<=50M, when we find the next prime, if we find a missed prime, the error rate, etc...

But until then I'd like to get as many people as possible to use the 991<n<50M dat, followed by prp-secondpass. Thanks for commenting on and following this thread Mystwalker, it's great to have your input.

vjs
04-26-2005, 05:46 PM
The most recent version of the 991<n<50M dat is available here...

http://www.teamprimerib.com/vjs/

There is no logon etc, hopefully this will make things easier.

The proth programs are also mirrored here since Klasson's site is not resolving today.

royanee
04-26-2005, 09:08 PM
There is an updated link to Klasson's site:
http://mklasson.com/proth_sieve.php
which resolves to:
http://85.8.4.99/proth_sieve.php
I don't think the university one works anymore...

Joe O
05-03-2005, 11:20 PM
Keep those factrange.txt (and fact.txt for the 991<n<50M) files coming. Here is a picture of the results: (Updated to reflect the new dat)

Joe O
05-04-2005, 02:39 PM
O yes, we are active for p <100T as well: (Updated to reflect the new dat)

vjs
05-04-2005, 02:46 PM
Joe,

correct me if I'm wrong here but those green and light blues below 20M are missed factors correct?

If so very cool and nice chart.

Joe O
05-04-2005, 02:53 PM
Originally posted by vjs
Joe,

correct me if I'm wrong here but those green and light blues below 20M are missed factors correct?

If so very cool and nice chart.

VJS,
They *were* missed factors, but we found them! <G>

Nuri
05-04-2005, 07:03 PM
joe, do you have anything in table format?

vjs
05-04-2005, 07:36 PM
Nuri,

What were you looking for exactly? I may have it...

Nuri
05-05-2005, 07:28 AM
This is just to see the number of missed factors from the main sieve effort (hopefully not many).

What I have in mind is something like the one below. Please feel free to change anything that would make it easier for you to create the table.

Nuri
05-05-2005, 07:32 AM
Ooops, 50T<p column would distort the data. There's no need for that.

vjs
05-05-2005, 01:29 PM
Hey Nuri, I think I have just the thing to make that work. Joe sent me an output with

n vs p vs # factors found.... I'm actually going to cut this file at 25T however, and be warned, we are not entirely sure how accurate it is; let me explain.

What the table will represent is those factors we found for p<25T and n<20M; however, the question remains, were these factors actually missed? Well, the k/n pair existed in the dat, this is true, so one would think it eliminated a test. But the dat is formed and updated from the results.txt output file, which has been cropped at p>25T for size purposes. So the question remains, did the test exist in the server queue???

In some cases we have a very strong feeling that the test did exist in the queue, because the old client versions were missing certain ?? j ?? or covering set (Joe could explain more), or the people only sieved one range and disappeared.

Example...

12000-12100 Nuri [complete]
12100-12150 Mr. newbie OneTimer [complete]
12150-12200 Louie [complete]

And we find a pile of factors between 12148-12150

But in other instances people who you wouldn't expect to miss a factor have?

_____________

Regardless, we are still finding missed factors above 25T, where we definitely know they are missed factors. For these factors found above 25T we can say 100%: hey, they were missed, and yes, they are new. Now we can try to come up with reasons for them (users, clients, server, computer, etc.), then predict patterns and gaps; we may offer these ranges to people in the future... I'm not sure if there is that much interest in these types of special projects.

Joe, Mike and I have been doing this factor prediction with quite a bit of success and some failures. Regardless, when we repeat the range we always find factors from 20M-50M using the 991<n<50M dat.

O.K. let me get to the table.

vjs
05-05-2005, 03:37 PM
ARRGHH!!!

I had a huge write up and lost it on submission... sorry.

I'll have to come back to it later; perhaps others can spot the trends and we can discuss. Can someone explain the table as well? I seriously ran over on message size.

IronBits
05-05-2005, 03:41 PM
Originally posted by vjs
I had a huge write up and lost it on submission... sorry.

Notepad is your friend when doing more than a quick reply ;)
Sorry you lost it all! :cry:

Ya think that TABLE is large enough to read at about 100 yards :rotfl:

vjs
05-05-2005, 03:56 PM
Yes, I generally ctrl-a ctrl-c before I submit, which I did as well, but for some reason it didn't copy? Also you can't preview pictures... Free-DC forums are great and stable. My fault ... the picture was 3MB in size :sofa:

Regardless here is an excel version.

Points to watch for

n<3M, n<1M, the 300K<n<3M and 3M<n<20M dats, 7T-8T, 14T-15T, 19T-20T.

I didn't think there were any mistakes in the data but my day is rough, thanks Ironbits.

IronBits
05-05-2005, 05:19 PM
Look ok? Send me anything you want and I'll size it anyway you want - files that is. ;)

vjs
05-10-2005, 04:05 PM
Well, I looked through the latest stats and some gap analysis that Joe sent me, and there are two "high possibility" missed-factor ranges <75T.

Is anyone interested in doing a T or two that encompasses either of these gaps?

Nuri
05-10-2005, 05:09 PM
what are their sizes?

IronBits
05-10-2005, 06:03 PM
Originally posted by vjs
Is anyone interested in doing a T or two that encompasses either of these gaps?

Yes, if it will help you guys out...
24/7/365 crunching going on over here ;)

vjs
05-10-2005, 06:41 PM
Originally posted by Nuri
what are their sizes?

It doesn't really work that way; the size of the actual gap is quite small, but where is it exactly? (That would be too easy.)

It's a matter of the resolution of the method, which is not great.

Let me give you an example: a gap was predicted around the low 33T-34T area (can't remember exactly).

A sieve of 32T-33T came up with the following missed factors at particular n-levels:

n-range (#missed factors)
1M-2M 1
2M-3M 5
3M-4M 0
4M-5M 2
5M-6M 1
6M-20M 0

9 total missed in 1T. Not great I guess, but 31-32T had one missed factor and 33-34T had 2.

Another gap was predicted between 38T-40T.

So far we have found 32 factors between 38-40T with 1M<n<20M; at this rate it's almost better than first-pass sieve.

I'll make another table like the last one once we have everything less than 40T done.

--------------------------

What I offered before was a test of the prediction model. I can't say 100% that there will be any missed factors, but it's more than likely. For the near term it's probably worth pecking at some ranges between 50T-100T as opposed to continuing from 44T up (at least from the standpoint of missed factors). For this and administrative purposes, 1T chunks at a time are probably best.

Nuri
05-11-2005, 05:34 PM
I see, thx..

vjs
05-12-2005, 02:21 PM
Here's an example of a "High probablility missed range"

99.929T 19249 19579982

It would have taken weeks to test this pair twice, found it in two days with sieve probably more to come.

Not to mention all of the n>20M stuff.

Re-doing the low ranges now is really only benifital if we find a few of these from time to time.

Probably wound't have found it using p-1 and there was obviously no other factor <45T or no other one that we found between 45T to ~800T.

99 928888 301542 = 2 x 3 x 43 x 387231 272099

jasong
05-17-2005, 05:08 PM
Originally posted by vjs
Regardless, we are still finding missed factors above 25T, where we definitely know they are missed factors. For these factors found above 25T we can say 100%: hey, they were missed, and yes, they are new. Now we can try to come up with reasons for them (users, clients, server, computer, etc.), then predict patterns and gaps; we may offer these ranges to people in the future... I'm not sure if there is that much interest in these types of special projects.

Joe, Mike and I have been doing this factor prediction with quite a bit of success and some failures. Regardless, when we repeat the range we always find factors from 20M-50M using the 991<n<50M dat.

I've heard a little about some accounts, called "garbage" and "secret" I think, that I'd be willing to try if I had any instructions.

I don't care much about points, but I LOVE to crunch. If you have anything unusual that doesn't require programming or compiling and has instructions that can be followed by your average college student, I'm game to do it. I won a computer in a drawing by my team, Free-DC, so I'll have some extra crunching power in a week or two.

vjs
05-17-2005, 05:58 PM
The secret and supersecret accounts are old accounts that have been replaced by QQQsecondpass.

So if you wanted to run secret or supersecret, the best bet is to use your usernameQQQsecondpass. Both of those accounts basically ran secondpass anyways.

What you have to do is edit the registry; look at the lost tests thread, second down in the main forum. In the registry you should see your user name, jasong.

edit it to jasongQQQsecondpass.

Now garbage is another matter; here you just log on (change your username with the main client) to garbage. I'm not sure if this account is really doing anything useful currently. It may be double or even triple checking. Until that's straightened out or explained fully I'd suggest not running the garbage account at all.

There is another special account, holepatch; it's currently running lost tests or something strange. There aren't many tests there, about 1300, and they are all between 1.8M and 3M.

I'm doing a few using the username holepatch just to finish off that queue.

In all reality it's probably best to run the QQQsecondpass IMHO.

hhh
05-20-2005, 09:22 AM
I have got a stupid question. Soon, the so-called 'high range sieving' (the range isn't that high anymore) will have dropped the factor density sufficiently to change the server settings to accept factors for n>20,000,000. My belief is that one day this will happen.
Is somebody considering stopping the 'normal' sieving and switching to 'high range sieving' completely? At least when the server accepts all factors? Or are there important reasons to continue with the dat file used until now?
The only reason why not everybody is using the new large dat seems to me to be the factor acceptance thing.
Correct me if I am wrong, please. H.

vjs
05-20-2005, 12:09 PM
hhh,

You're correct, I think the dat would be used more if the server would accept the factors, but ultimately it will be Louie's call about accepting them. A lot of "older" sievers are using 991<n<50M now for main-effort ranges. All those factors with n<20M are accepted through the server and it's not difficult to zip and mail those above 20M.

I'd like to hear from others about using 991<n<50M. Also, if we were to completely switch it might be to a 1.9M<n<50M dat; at this time it's hard to say. There is some interest in those low n's.

Joe O
06-07-2005, 03:04 PM
I've started processing the fact.txt, factrange.txt, and factexcl.txt files submitted since the last DAT was created. The following graph shows what is new since the last time. If anyone has factrange.txt files from the 20M effort, please send them to factrange at yahoo dot com. Those of you using the 991<n<50M dat, please submit your fact.txt and factexcl.txt files as well as your factrange.txt files. Thank you.

vjs
06-07-2005, 03:14 PM
Joe I'll have 40-44T done in about a week with 991<n<50M.

<edit> pushing that range a little hard trying to send it by the weekend. I'll also send out the partials from my higher ranges.

Keroberts1
06-07-2005, 11:21 PM
my range should be completed within 2 days.

Joe O
06-08-2005, 12:11 PM
Just to complete the picture from my previous post, here is the primary sieve range. Note the places where people used the 991<n<50M dat. 250% more pairs sieved for 10% more effort.

Matt
06-08-2005, 06:29 PM
I've switched to this larger DAT file and speed has decreased from ~550kps to ~430kps :(

vjs
06-08-2005, 07:25 PM
That's a little more than expected, Matt; generally people see a 10-20% decrease in speed, with the average around 15%. I think Joe is being over-optimistic. :Pokes:

You have to remember however that you are actually sieving a range that is over 250% larger, so you're increasing overall production by about 220%.

vjs
06-13-2005, 02:25 PM
Lower Upper k/n's k/n's Current Update Update Update
(n>) (n<) 25T Found Dat 12-May 6-Jun 44T+
991 2M 53303 244 52944 155 89 115
2M 9M 183438 598 182840 316 64 218
9M 10M 26223 168 26055 66 3 99
10M 20M 262951 746 262205 391 66 289
20M 30M 289470 6285 283185 2866 1846 1573
30M 40M 290320 6244 284076 2778 1877 1589
40M 50M 291558 6323 285235 2923 1838 1562
50M 100M 1554759 2370 1552389 967 778 625
Sum 2952022 23093 2928929 10462 6561 6070

Lower Upper k/n's k/n's Current Update Update Update
(n>) (n<) 25T Found Dat 12-May 6-Jun 44T+
991 2M 53303 244 52944 155 89 115
dat % - 0.46 99.33 0.29 0.17 0.22
2M 20M 472612 1512 471100 773 133 606
dat % - 0.32 99.68 0.16 0.03 0.13
991 50M 1397263 20608 1376540 9495 5783 5445
dat % - 1.47 98.52 0.68 0.41 0.39
20M 50M 871348 18852 852496 8567 5561 4724
dat % - 2.16 97.84 0.98 0.64 0.54
50M 100M 1554759 2370 1552389 967 778 625
dat % - 0.15 99.85 0.06 0.05 0.04
991 100M 2952022 22978 2928929 10462 6561 6070
dat % - 0.78 99.22 0.35 0.22 0.21

vjs
06-13-2005, 02:35 PM
Well, Joe just sent me some stats from the latest dat run; I'll be processing these throughout the day.

The stats above are a continuation of the 25T stats update earlier. Currently all p<44T have been sieved with the 991<n<50M dat file. Since the all-p<25T update we have managed to eliminate another 18852 k/n pairs between 20M<n<50M, reducing the total number of tests by an additional 2.16%. With this 25T<p<44T range, interestingly enough, we have also eliminated 0.2% of the tests above 50M using factrange.txt.

Next, I'll work on tabulating the missed factors with n<20M, the double-check portion of this sieve effort.

Note that update May-12 and Update 6-Jun were non-public updates but I have included them regardless.

vjs
06-13-2005, 04:02 PM
O.K. here is the table of factors missed by previous sieve efforts and later found using the 991<n<50M dat between 25T-44T.

Our double-check sieve effort removed a total of 314 factors with n<20M from 25T<p<44T.

This can be broken down between first-pass and second-pass tests; other factors found are below the double-check effort.

163 secondpass tests
71 firstpass tests

Looking at the stats it certainly looks like the effort was worth it for the 25-44T range.

maddog1
06-13-2005, 06:06 PM
Interesting update VJS...
I was just wondering, is there a specific known reason for the large number of missed factors @38 & 43T?
I mean, was it a faulty PC, a careless user who forgot to submit all finds or something else I'm missing?
In any case, good effort for your team...congrats :)

vjs
06-13-2005, 07:11 PM
Several reasons for the missed factors and why there are so many ...
Looking at the table there are quite a few things to consider.

First, those factors with n less than 300K were never looked for in the first place, so the 0M-1M numbers are artificially high (not a double check etc.). Second, there were some problems regarding the 300K<n<3M and 3M<n<20M dats and range reservations at the time. It seems like some people were using the 300K<n<20M where they should have been using 1M<n<20M, so some regions were never sieved for n<3M or 3M<n<20M.

Missing the factors for n<3M is currently not that critical, but missing n=10M-20M factors, WOW!!!

This is part of the reason why we are redoing the sieve effort using a complete 991<n<50M dat; we will get all of them, including those 20M<n<50M.

Also, when one of these 3M<n<20M or 300K<n<20M ranges is found, or someone didn't submit 20G worth of factors, etc., the sheer number of factors we find is very high. This is due to the factor density at low p.

When we were sieving around 45T we were getting something like 400 factors per 1000G sieved; now we get roughly 15 factors per 1000G sieved (in the 1M-20M range). So missing a 10-20G sieve range didn't seem like much then, only 4-8 factors, but that translates into 250-500G of today's sieve.
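
A tiny sketch reproducing that equivalence, using the densities quoted above (these are the post's round numbers, not fresh measurements):

old_density = 400 / 1000    # factors per G around 45T (1M<n<20M)
new_density = 15 / 1000     # factors per G at today's depth (1M<n<20M)
for gap_g in (10, 20):
    missed = gap_g * old_density
    print(f"{gap_g}G hole ~ {missed:.0f} factors ~ {missed / new_density:.0f}G of today's sieving")
# 10G hole ~ 4 factors ~ 267G;  20G hole ~ 8 factors ~ 533G (the 250-500G quoted)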

Of course there were other problems early on where proth or the sieve client at the time just missed factors. I believe we are beyond this point now (p=44T) but several ranges above 44T were sieved and we found factors there as well. Why I don't know but we are still finding some.

I think what's important is to continue this low-p high-n effort until the number of missed factors decreases to a point of being inefficient. I also encourage people to sieve main ranges (i.e. the current effort) with the 991<n<50M dat.

I did a very rough calculation...

It would have taken me a little more than a year to sieve this entire range by myself, ~380 days. However, testing all of these missed k/n pairs would have taken me in excess of 3 years; that's a pretty valid argument for the effort in my eyes.

vjs
06-13-2005, 07:12 PM
Specifically for 38 and 43T, they were larger holes; Joe will probably post a graph soon.

vjs
06-13-2005, 07:15 PM
O.K. posting Joe's work on his behalf...

You can also see some of the other ranges we tested above 44T...

- Black dots: what was found before, plus factrange.txt submissions from the 1M<n<20M dat.
- Blue dots: new factors found using the 991<n<50M dat (and missed factors n<20M).
- Green dots: Ironbits' work in progress; he hasn't submitted or we haven't processed those n>20M factors yet.

Now if you look at the blue strips:

The blue strips 20M<n<50M are ranges we tested with 991<n<50M. If there are blue dots below 20M these were previously missed factors. So you can see that Joe actually "hit a jackpot" around 99T.

vjs
06-13-2005, 07:28 PM
Here is a quick missed-factors post for Joe's graph 1M<n<20M ...



Lower n 1M 2M 5M 6M 9M 11M 14M 15M 16M 17M 19M
Upper n 2M 3M 6M 7M 10M 12M 15M 16M 17M 18M 20M
49T 3 2
50T 6 5 1 1 1 1 2 2
78T 2
89T 2 1
99T 1 1 1 1 1 1

hhh
06-16-2005, 10:11 PM
I think the next aim should be to get to 88T as fast as possible, in order to increase sieving speed, too.

vjs, can you assign me a range of 1T near 80T, please?

Thank you in advance, Yours H.

Joe O
06-17-2005, 07:00 AM
Originally posted by hhh
I think the next aim should be to get to 88T as fast as possible, in order to increase sieving speed, too.

vjs, can you assign me a range of 1T near 80T, please?

Thank you in advance, Yours H.

79000-80000 HHH

Is that near enough?

It's not VJS but that Other guy. Will that do?

vjs
06-18-2005, 03:32 PM
HHH,

Just to let you know, a lot of the other ranges less than 88T are being worked on as well. Ironbits just took a chunk under your own. We will see what happens with this slightly higher T. There are a few places where no factors <3M were reported, and maybe some vice versa; can you say 300K<n<3M dat and 3M<n<20M dat confusion?

This is part of the reason why we are sticking with 991<n<50M.

If I'm nice to Joe he'll run another stats update once we get everything to 50T done. I'm the only one with a range less than 50T outstanding; we will also see what effect the 9k had, etc.

jandersonlee
07-05-2005, 12:33 PM
I've got four machines running on 991<n<50M. Currently I can bang out 1T in about 2 weeks (assuming none of the machines crashes, that is). Right now I'm doing high range (862T-863T) sieving for the main effort. If it would help at all I'd be happy to switch to second-pass sieving on lower T ranges after the latest range finishes (around July 18).

vjs
07-05-2005, 01:06 PM
Jandersonlee,

It's your call on the low-p high-n or high-p high-n. Sieving both is useful, but we haven't been finding a lot of missed factors lately with the low-p. The major push right now is to basically just finish everything less than 100T. It's not a rush as long as we can get them done.

If you'd like to try a 1T range I can assign one to you. If you'd like to continue with the main effort using 991-50M from 862000 down this is great as well.

Thanks for the help and let me know if you want a range of low-p.

hhh
07-06-2005, 01:34 AM
I guess you are checking every number sent to factrange@yahoo.com to see if it really is a factor, like the submission script does, but are you also checking for factors found the first time but missed the second time? This might help discover hardware or software problems.
H.

Joe O
07-06-2005, 01:42 AM
Originally posted by hhh
I guess you are checking every number sent to factrange@yahoo.com to see if it really is a factor, like the submission script does, but are you also checking for factors found the first time but missed the second time? This might help discover hardware or software problems.
H.

Actually yes. That's one of the reasons why we are asking for factexcl.txt files.

vjs
07-18-2005, 04:44 PM
O.K. Guys and Gals!!!

High Time for a High-n_low-p Update.

Joe just ran another dat update on the 991-50M dat; we are doing great. Looking at the data, a couple of things stand out.

- We eliminated almost an additional 1% of the tests from 20M<n<50M with this update :thumbs: . This is really saying something since we are now at higher T than before.

I'm always curious about how many missed factors we have found.
- 18 missed factors were found!!!

18 when I only count those which eliminate tests. I always try to be careful when counting these; they are for 2M<n<20M and only for those k which have no prime.

We found quite a few missed ones n<2M, but since secondpass is beyond that level it's hardly fair to count them.

Look for more data tomorrow; also a new dat will be working its way through the system. But for now here are the first couple of pics.

vjs
07-18-2005, 04:51 PM
O.K., one more graph; this one is a little difficult to explain. The axes have changed slightly...

The 100% sieve level for the 991-50M dat is currently sitting around 53T; however, a large number of ranges above 53T and less than 100T have been totally sieved.

The y-axis has not changed and still represents the total number of k/n pairs in the 991-50M dat. The x-axis represents the number of T sieved below 100T. I couldn't think of a better variable than this one to show our progress; it seems to work well.

From the graph it's apparent that we had a fairly large decrease. I'd also like to point out k=27653 has been removed from all data past and present. In other words there is no effect from the prime in these results.

Joe O
07-18-2005, 07:23 PM
25T<p<100T

Joe O
07-18-2005, 07:25 PM
100T<p<1000T

Nuri
08-16-2005, 12:10 PM
Time for a monthly update??:bouncy:

vjs
08-16-2005, 01:00 PM
Hey Nuri,

You're probably right, but let's give it another week or two; we will have a little more by then.

ShoeLace
08-17-2005, 04:04 AM
don't we always have a little more every week?

Joe O
08-17-2005, 08:43 AM
Originally posted by ShoeLace
don't we always have a little more every week?
Yes, we even have a little more every day. It's hard to balance the work of producing a dat vs the work saved by using a smaller dat. There is also the fact that many people do not update their dat that often. The 1% or 2% savings do not warrant stopping and starting the sieve programs. It's just easier to keep the nextrange.txt file full of the next ranges to sieve, and possibly more productive.
Now if you are talking about pictures, that is a different story. Those are easy to produce. Having said that, here they are. First the 50T-100T range:

Joe O
08-17-2005, 08:46 AM
100T<p<1000T

vjs
08-17-2005, 10:03 AM
Thanks for doing the pictures Joe,

Shoelace, yes we do have a little more every week. But it's more an issue of users submitting the data and what data has been submitted. There are only two people working below 100T afaik, Hades_Au and Stromkarl. Both of these guys are trying to finish off the gap around 55T.

I was thinking it would be nice to do another stats run once they hand in those ranges. I'd like to tabulate everything that has happened at some point.


The greater-than-500T stuff is best represented graphically, as Joe_O has shown.

Nuri
08-17-2005, 04:23 PM
vjs/JoeO

can you please give a list of unassigned ranges below 100T. This is just to get a feeling of where we are.

Nuri
08-17-2005, 04:25 PM
And as far as the request is concerned, yes, I was talking about the pictures, and thx for the quick reply.

In my opinion, an update after each 100T is finished would be enough for the dat file.

Joe O
08-17-2005, 07:11 PM
Originally posted by Nuri
vjs/JoeO

can you please give a list of unassigned ranges below 100T. This is just to get a feeling of where we are.



70000 74000 991<n<50M <------------ Available ------------->
81000 85000 991<n<50M <------------ Available ------------->


Or, you can see for yourself here. (http://groups.yahoo.com/group/SierpinskiSieve/database?method=reportRows&tbl=1)

jandersonlee
08-17-2005, 09:55 PM
Hmm. Have some idle machines at the moment.

70000-71000 jandersonlee [reserved] (w/ 991<n<50M)

Or is there somewhere else to record this?

Jeff

Joe O
08-18-2005, 08:45 AM
Originally posted by jandersonlee
Hmm. Have some idle machines at the moment.

70000-71000 jandersonlee [reserved] (w/ 991<n<50M)

Or is there somewhere else to record this?

Jeff

I'll record it for you. But yes there is a place to make reservations. The best way is to email factrange at yahoo dot com with for example
70000-71000 jandersonlee [reserved] (w/ 991<n<50M) in both the subject and the body of the email. But any old email will do.

jasong
08-18-2005, 09:58 PM
I'm not sure if I want to help with the low-p sieving or not. I have 2 computers, 256K cache on each, 1.25GHz AMD something and a 1.75GHz Sempron.

If anybody knows what these two babies can do in two weeks time, check to see if the number is more than 1 trillion. If it is, go ahead and assign me something.

Alternately, tell me a range that has been sieved, and I'll tweak the necessary .dat file so it'll run 200 million and give me an accurate measure in Sobistrator. (You know what? Never mind. I'm going to go ahead and do the tweaking. brb)

Edit: Total sieving speed of 625kp/s, about 2 weeks 4 days for a trillion. A little too much, but if you want to break the rule about 1T chunks, I'd be willing to tackle .5 or .75 trillion.

vjs
09-19-2005, 11:34 AM
Well, I finished up my 865000-870000 range over the weekend; here are some stats for you guys. Out of the 5T range I found 109 factors <20M and 247 factors between 991-50M.

Basically the factor density over this range was....

20 factors / 1T with n<20M
49 factors / 1T with 991<n<50M


Still pretty good IMHO. I'll let you guys and gals know what happens with the 1120T.

MikeH
10-08-2005, 06:15 PM
Looks like the factor submission form http://www.seventeenorbust.com/sieve could now be accepting some n values above 20M. Not sure if it's intentional, but I was alerted to a few being in results.txt:

n=22418391
n=23423165
n=31287881
n=20134244
n=28359068

then I submitted this one that I had lying around



Factors
899671439917729|4847*2^26696511+1

Verification Results
899671439917729 4847 26696511 verified.

Factor table setup returned 1
Test table setup returned 1

1 of 1 verified in 0.11 secs.
1 of the results were new results and saved to the database.

maddog1
10-08-2005, 08:58 PM
This is what the factor submit page says right now:

For those interested, here are the constraints on what the verifier will accept:
k = one of the 9 left
1000 < n < 50000000
1000000000000 < p < 2^64


Appears someone worked on it :)
I'll have to try and submit my old factrange.txt contents, I still have all of them around.

vjs
10-10-2005, 12:58 PM
I just wanted to point out the status of the 991<n<50M dat double check...

All ranges 0-72000 have been completed using the 991<n<50M.

0 72000 Completed

Here is what's going on between 72T and 100T as you can see we are well on our way to finishing everything less than 100T.

72000 74000 Reserved Stromkarl
74000 78000 Completed Ironbits
78000 79000 Completed Joe O
79000 80000 Completed HHH
80000 81000 Completed Hades_Au
81000 85000 Available <---- lowest reservation point
85000 88000 On Hold Or combined Effort
88000 89000 Reserved Joe_o
89000 90000 Completed Joe O
90000 91000 Combined Effort Reserved HHH
91000 95000 On hold Or Combined effort
95000 97000 Completed e
97000 100000 Complete VJS

I'd also like to add that several smaller ranges 100T<p<500T have been completed by Joe_O, checking holes etc.

pixl97
10-10-2005, 04:26 PM
Notice! The sieve posting page may be broken: if you enter a factor above 32 million (probably an unsigned 32-bit integer issue) it does not verify the factor, and it appears not to verify factors entered after it on the same page! Will an admin please check if that's the case; some people may be losing factors to this.



920902464340249 | 4847*2^27525327+1 : Works
920526825028453 | 21181*2^28359068+1 : Works
920754397985377 | 21181*2^33801212+1 : Does not work

vjs
10-11-2005, 10:09 AM
It's only recently that the server accepts any factors with n>20M; those >20M must be e-mailed to factrange@yahoo.com once the entire range is finished.

royanee
10-12-2005, 12:17 PM
Originally posted by pixl97
Notice! The sieve posting page may be broken: if you enter a factor above 32 million (probably an unsigned 32-bit integer issue) it does not verify the factor, and it appears not to verify factors entered after it on the same page! Will an admin please check if that's the case; some people may be losing factors to this.



920902464340249 | 4847*2^27525327+1 : Works
920526825028453 | 21181*2^28359068+1 : Works
920754397985377 | 21181*2^33801212+1 : Does not work

32 million is kind of a random number for it to break on. A 32bit signed integer would have problems at 2.1 billion, whereas unsigned 32bit would mess up at 4.2 billion. 32 million is roughly 25 bits... doesn't ring any bells for me...

vjs
10-12-2005, 12:38 PM
It actually rings a bell.

Somewhere around that range is the minimum n for the EFF prize (money) winning prime. So in other words the limit may be an intentional program limit created by the coder etc...

pixl97
10-12-2005, 02:44 PM
Ya, I tested this again today. Still, in my eyes, a serious problem..

1 of 17 verified

921011660940941 | 55459*2^17205790+1 : Verified
921068079832919 | 4847*2^36387327+1 : No
921076620830321 | 55459*2^13850266+1 : No - should have
921105537390287 | 24737*2^15437407+1 : No - should have
921126590368079 | 22699*2^48725398+1 : No
921156430133813 | 55459*2^30391546+1 : No - should have
921165601914619 | 55459*2^45318190+1 : No
921286274117719 | 4847*2^17870991+1 : No - should have
921297091642001 | 4847*2^23853111+1 : No - should have
921300547253111 | 4847*2^41540007+1 : No
921310300068559 | 55459*2^46574014+1 : No
921319782560293 | 21181*2^40936292+1 : No
921327573029173 | 33661*2^9838296+1 : No - should have
921340761412397 | 55459*2^28030474+1 : No - should have
921355297438757 | 55459*2^33958954+1 : No
921358184252041 | 24737*2^13649527+1 : No - should have
921368678312893 | 24737*2^31790383+1 : No - should have

Personally I think this is a major problem: people using the 991<n<50M dat who are not paying attention will miss entering factors. Just one factor above 32M and the rest after it DO NOT VERIFY.

I've been removing the ones over 32M when posting now so the rest will verify, and sending everything to factrange@yahoo.com.

Update: I entered the 9 'should have' factors by themselves afterwards and they verified correctly; just remember anything over 32M will cause problems.

9 of 9 verified in 0.61 secs.
9 of the results were new results and saved to the database.
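If anyone wants to automate that workaround, a rough split along these lines would do it (a hypothetical sketch; the file names are just examples and the usual "p | k*2^n+1" line format is assumed):

# Rough workaround sketch: split results into "safe to paste" (n <= 32M)
# and "e-mail to factrange" (n > 32M) until the submit page is fixed.
LIMIT = 32000000

with open('results.txt') as src, \
     open('submit_these.txt', 'w') as ok, \
     open('email_these.txt', 'w') as big:
    for line in src:
        line = line.strip()
        if not line:
            continue
        n = int(line.split('*2^')[1].split('+')[0])
        (ok if n <= LIMIT else big).write(line + '\n')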

vjs
10-12-2005, 03:13 PM
Well, the failures on the ones less than 20M are very troubling. Could you e-mail Louie or Alien88 about this problem? Thankfully we have a backup record with factrange@yahoo.com.

hc_grove
10-12-2005, 04:57 PM
At the beginning of this month I wrote Louie about two factors I found by P-1 that I can't submit; I still haven't heard from him.

royanee
10-13-2005, 02:37 AM
Originally posted by vjs
It actually rings a bell.

Somewhere around that range is the minimum n for the EFF prize (money) winning prime. So in other words the limit may be a intentional program limit created by the coder etc...

If that were the case, then it would only apply to n >= 33,219,281. Of course, that's not considering k, so if k is 67607, then it would be n >= 33,219,265.

However, this probably isn't the case, simply because the EFF prize only applies to primes, so if we have a factor, then that candidate k/n pair is definitely not prime. I wonder what the admins are up to; they seem to be busy/absent recently, and they're really the only ones who can give us a heads up on what might be happening.
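For anyone curious where those n values come from, the digit counts can be estimated directly (a quick sketch using the standard base-10 log estimate; exactly which n is the minimum depends on the rounding convention used):

from math import log10, floor

# Sketch: approximate decimal digit count of k*2^n+1
# (the "+1" never changes the count for numbers this large).
def digits(k, n):
    return floor(n * log10(2) + log10(k)) + 1

# Digit counts at the n values mentioned above -- both land just past
# the ten-million-digit EFF mark; the exact minimum n depends on k.
print(digits(1, 33219281))
print(digits(67607, 33219265))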

Joe O
10-15-2005, 09:56 AM
We are still accepting all factors at factrange at yahoo dot com.
Here is a picture of the most recent results:

Alien88
10-21-2005, 07:12 AM
Originally posted by hc_grove
At the beginning of this month I wrote Louie about two factors I found by P-1 that I can't submit; I still haven't heard from him.

It should be fixed now; let me know if you're still having problems.

Greenbank
10-21-2005, 07:19 AM
It never ends, does it...

How about it saying:

"k = one of the 8 left"

;-)

vjs
10-21-2005, 08:41 AM
Let's give Louie a break :spank: :D

Looks like they are all verified except for k 4847 of course

Max n tested was n~47M for k 21181

I'd say it looks good.

----------------------------

To all sievers: please resubmit all of your ranges again to see what was missed, etc. I'd also like to encourage people to keep submitting to factrange@yahoo.com for the time being.

Alien88
10-21-2005, 02:22 PM
Originally posted by vjs
Let's give Louie a break :spank: :D

Do I get a break? :jester:

vjs
10-21-2005, 02:41 PM
Nope :whip:

You've got a big list; 9k to 8k is a trivial one.

:Pokes:

Then we can all :clap: and :drink: then :music: and :drums:.

But it's more likely that we should be doing things like :bath: and :kiss: our :reddress: and :cell: our :geezer:.

With everyone asking about
- queues
- <50M
- error rates
- and web stuff

I bet you're thinking that :stretcher: is coming and :borg: isn't a bad alternative.

maniacken
01-08-2006, 06:18 PM
Where do I reserve a range for low p sieving?

Or could someone reserve a range and tell me?

Joe O
01-08-2006, 08:19 PM
Where do I reserve a range for low p sieving?

Or could someone reserve a range and tell me?
196000 197000 Maniacken

maniacken
01-08-2006, 08:31 PM
Thanks Joe. Do I still need to submit factors to factrange@yahoo.com?

Joe O
01-09-2006, 09:52 AM
Thanks Joe. Do I still need to submit factors to factrange@yahoo.com?
It would be preferable if you would. That way I am sure to get them.

maniacken
01-31-2006, 02:08 PM
196000 197000 Maniacken complete. 3 factors below 20M, 104 factors total.

Greenbank
02-08-2006, 10:10 AM
Can I get an 8T range please? Preferably on a T boundary, but I'll take anything. As low as possible. :-)

vjs
02-08-2006, 12:26 PM
Not sure if Joe gave you a range yet...

There are potential missed factors around the 197T range, but I'll give you the lowest available as you requested.

125000-133000 Greenbank


PSP also has some ranges around 50T (I think) if you're interested...

Either way I'll assume you're happy with 125-133T; if not, let me know.

------------------------

Also how are these coming

83000 84000 991<n<50M Reserved Greenbank
84000 85000 991<n<50M Reserved Greenbank

Greenbank
02-08-2006, 01:18 PM
125000-133000 Greenbank

Also how are these coming

83000 84000 991<n<50M Reserved Greenbank
84000 85000 991<n<50M Reserved Greenbank

125T-133T is perfect; I'm going on holiday soon and I don't want my machines to run dry.

83T and 84T were done back in November but my emails to factrange@yahoo.com were bouncing. I think the last range I sent was 948T to 948.5T.

If that is the case I have:-

83T to 84T (1T)
84T to 85T (1T)
962.9T to 963.4T (500G)
963.4T to 963.9T (500G)
1104T to 1105T (1T)
1105T to 1106T (1T)
1001T to 1002T (1T)

and in just over a week I'll have:-

986.4T to 988.4T (2T)
1003T to 1007T (2T)

Will bundle them up in the next few minutes and send them to the usual address.

Greenbank
02-08-2006, 01:36 PM
The zip files bounced again. Is there a problem with the factrange@yahoo.com email?

vjs
02-08-2006, 01:39 PM
No, I don't think so; Joe and I have e-mailed back and forth today.

I sent you another e-mail in a PM; give that a try.

Mystwalker
02-08-2006, 05:47 PM
125T-133T is perfect; I'm going on holiday soon and I don't want my machines to run dry.

If you want, you can also try out the combined search for k's of both SoB and PSP. You would need this (http://www.ldausch.de/test/sievecomb.zip) sob.dat for it.

I don't know about the speed reduction, but I'd guess it's somewhere from 30 to 40% - but you'll sieve 20 k's instead of merely 8 and save PSP from sieving the same range for the other 12 k's.

In addition, I'm interested in how your new version behaves when sieving 20 k's, as that's the way I've been sieving for the last few months (and I intend to continue)...
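To put rough numbers on that trade-off (just a sketch using the guessed 30-40% slowdown above, not measured figures):

# Rough trade-off sketch using the guessed slowdown, not measured numbers.
# Sieving 20 k's at 60-70% of the 8-k speed still covers more k-range per
# unit of work than SoB and PSP sieving the same p range separately.
base_rate = 1.0                      # normalised 8-k sieve speed (range/time)
for slowdown in (0.30, 0.40):
    combined_rate = base_rate * (1 - slowdown)
    separate = 8 * base_rate         # k-coverage per unit time, SoB dat alone
    combined = 20 * combined_rate    # k-coverage per unit time, combined dat
    print(f"{int(slowdown*100)}% slower: {combined/separate:.2f}x the k-coverage")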

Greenbank
02-09-2006, 04:01 AM
I got several PMs from people (ltd, hhh) about this too.

Not a problem to sieve for both; I sort of promised that whatever speed increases are obtained from new clients would be put towards combined sieving.

Mystwalker
02-09-2006, 06:23 AM
Thanks a lot, Greenbank! :banana: :thumbs:

ltd
02-09-2006, 11:03 AM
Many thanks from my side also.

:clap: :|party|: :cheers: :clap: :D

Greenbank
02-09-2006, 12:30 PM
Heh, thanks but I won't even start sieving that range for another 8 days or so.

Meanwhile there are plenty of people who've contributed more to PSP than me (not hard when you consider I've sieved 0T so far!) :-)

Mystwalker
02-09-2006, 01:28 PM
That's true, but you'll definitely be in the Top10 afterwards. And I hope it inspires more 991<n<50M sievers to do the combined sieving. After all, PSP and SoB can help each other here. :)