P+1 factoring -- brainstorming

**jjjjL** · 07-10-2003, 04:46 PM

Here's a repost from another thread:

Posted by Joe_O

wblipp used 2^4 == 1 (mod5) and 15==0 (mod 5) to show that all remaining candidates are == 2 (mod 5)
i.e. 2^4m+15i*2^(4m-j)+1==1+0+1==2 (mod 5)

similarly 2^2==1 (mod 3) and 15==0 (mod 3) show that all remaining candidates are == 2 (mod 3)

also 2^4 == 1 (mod 15) and 15 == 0 (mod 15) show that all remaining candidates are == 2 (mod 15)

So if Q is a candidate:
Q = 5*l1+2 where l1 is an integer. note that if l1 is even then Q==0 (mod2) so we only need to look at l1 an odd integer

similarly
Q=3*l2+2 l2 an odd integer

Q=15*l3+2 l3 an odd integer

Now you ask why am I posting this? Well for one thing Q=3*l2+2 gives us Q+1=3*l2+3 == 0 (mod 3)
This might mean that P+1 factoring might give us an advantage since 3 will always be a factor.
And maybe someone can use the fact that all candidates are of the form 15*l3+2 to come up with something.

Posted by Me

I factored a few P+1 values and didn't find 3 in all of them. Perhaps your observation about candidate form don't tell us as much about factor form as you might have hoped.

However, I've been thinking about P+1 factoring too. What would make it really efficient would be if we could make it non-probablistic. If we knew in advance a value S where the Jacobi[S^2-4,P] = -1 we could make the P+1 factoring work each time instead of ~50% with random S seeds.

It may be unrealistic to expect that we could guess an S^2-4 that is a quadradic non-residue mod P without knowing P in advance but perhaps the fact that P is prime will help (meaning the Jacobi symbol is techincally a Legendre symbol). Also, SoBSieve uses similar identities to double it's efficiency by calculating the Kronecker[N,P]. Of course, in that case, P is known. I'm gonna look into it.

**jjjjL** · 07-10-2003, 05:04 PM

First, let me explain more clearly why P+1 factoring is probablistic. Here's from the GMP-ECM readme file

<snip>

Code:

The P-1 method works well when the input number has a prime factor P such
that P-1 is "smooth", i.e. has all its prime factor less or equal the 
step 1 bound B1, except one which may be less or equal the second step
bound B2. For P=67872792749091946529, we have P-1 = 2^5 * 11 * 17 * 19 *
43 * 149 * 8467 * 11004397, so this factor will be found as long as B1 >= 8467
and B2 >= 11004397:

$ echo 67872792749091946529 | ./ecm -pm1 -x0 2809890345 8467 11004397
GMP-ECM 5.0 [powered by GMP 4.1.2] [P-1]
Input number is 67872792749091946529 (20 digits)
Using B1=8467, B2=11004397, polynomial Dickson(6), x0=2809890345
Step 1 took 10ms
Step 2 took 320ms
********** Factor found in step 2: 67872792749091946529
Found input number N

There is no need to run several times P-1 with the same B1 and B2, like
for ECM, since a factor found with one seed will be found by another one.

The P+1 method works well when the input number has a prime factor P such
that P+1 is "smooth". For P=4190453151940208656715582382315221647, we have
P+1 = 2^4 * 283 * 2423 * 21881 * 39839 * 1414261 * 2337233 * 132554351, so
this factor will be found as long as B1 >= 2337233 and B2 >= 132554351:

$ echo 4190453151940208656715582382315221647 | ./ecm -pp1 -x0 2284918860 2337233 132554351
GMP-ECM 5.0 [powered by GMP 4.1.2] [P+1]
Input number is 4190453151940208656715582382315221647 (37 digits)
Using B1=2337233, B2=132554351, polynomial x^1, x0=2284918860
Step 1 took 3300ms
Step 2 took 2460ms
********** Factor found in step 2:
4190453151940208656715582382315221647
Found input number N

However not all seeds will succeed: only half of the seeds 's' work for P+1
(namely those where the Jacobi symbol of s^2-4 and P is -1.) Unfortunately, 
since P is usually not known in advance, there is no way to ensure that this 
holds. However, if the seed is chosen randomly, there is a probability of 
about 1/2 that it will give a Jacobi symbol of -1 (i.e. the factor P will 
be found if P+1 is smooth enough). A rule of thumb is to run 3 times P+1 
with different random seeds.

</snip>

So not only is it possible have a P+1 test fail because the seed was wrong, but it is also impossible to run enough seeds so that you are 100% sure it worked. In other words, runing 1 seed gives you a 50% chance of success, a second gives you 75%, a third 87.5%, etc.

It seems that our numbers may be able to avoid this probablistic nature somehow due to their unique form. The best thing would be if there were a seed that always worked for all proth numbers. This is kinda unlikely. Perhaps a seed can be calculated for individual proth numbers based on their k/n values that will always work. That's more likely to work. Failing those ideal cases, it seems possible that a pair of suitably dissimilar seeds could be found that would capture 50% of the the factors exclusively so that the sum is 100%. That would mean that two tests would be sufficient to prove the numbers are P+1 smooth.

In a simple attempt to see if there is some obvious way to make P+1 factoring non-probablistic for our numbers, I made a program to look for an approriate seed value that would always work for our factors. No real results yet. As expected, most give exactly 50% and when I combine the power of two values, they always seem to be 75% as well.

I think it is much more likely that the value will depend on a more intelligent rule that takes into account the k/n value being factored. Next I plan to analyze the data for values that fall into different residue classes mod 48.

If someone has any insight, feel free to join in the discussion.

-Louie

**jjjjL** · 07-10-2003, 05:15 PM

Originally posted by jjjjL

Code:

However not all seeds will succeed: only half of the seeds 's' work for P+1 (namely those where the Jacobi symbol of s^2-4 and P is -1.) Unfortunately, since P is usually not known in advance, there is no way to ensure that this holds. However, if the seed is chosen randomly, there is a probability of about 1/2 that it will give a Jacobi symbol of -1 (i.e. the factor P will be found if P+1 is smooth enough). A rule of thumb is to run 3 times P+1 with different random seeds.

BTW, I think there is a mis-print in the documentation. When it says the Jacobi symbol of s^2-4 and P is -1, I think it meant to say that the Jacobi symbol of s and P is -1 and the gcd of s^2-4 and N is 1 where N is the number being factored.

The code that picks the seed just picks random 32-bit numbers (stored as s) and tests for the gcd(s^2-4,N) != 0. When it finds one, it returns the value s, not s^2-4. It's the value s and P that need to be quadratic non-residues, not s^2-4 and P. Whoever wrote the readme must have goofed and jumbled the two rules together.

-Louie

**Moo_the_cow** · 07-10-2003, 06:20 PM

I can't understand half of the math here, but is P+1 factoring supposed to be more efficient than P-1 factoring?

P.S. I think that this thread should also be put in the factoring subforum.

**mklasson** · 07-10-2003, 06:51 PM

Sounding pessimistic, I have to repeat what Paul pointed out to me when I was enthusiastic about p+1.

As p-1 factoring is faster than p+1, there may not be a good reason to run p+1 at all even if you somehow manage to find seeds that work 100% of the time. It's not like we've got a shortage of pairs to try p-1 on. Doing p-1 on a new pair should give you a better chance of finding a factor than doing p+1 with the same cpu power, as you can go deeper cheaper with p-1. In short, it might be better to just do more p-1 instead of trying p+1.

**jjjjL** · 07-10-2003, 09:56 PM

Originally posted by mklasson
As p-1 factoring is faster than p+1, there may not be a good reason to run p+1 at all even if you somehow manage to find seeds that work 100% of the time. It's not like we've got a shortage of pairs to try p-1 on. Doing p-1 on a new pair should give you a better chance of finding a factor than doing p+1 with the same cpu power, as you can go deeper cheaper with p-1. In short, it might be better to just do more p-1 instead of trying p+1.

I agree with what you're saying.

P+1 will not be as fast as P-1 factoring. Since it involves multiplying lucas sequences instead of squaring it just won't run as fast. Factoring speeds will be more like ECM instead of P-1.

That said, there's no reason to believe that it will run so slow that P+1 factoring will never make sense. It will obviously never make sense to factor deeper in the P+1 direction. But at some n value, it may make sense to do a little P+1 factoring.

It's important to realize that the only reason we don't have P+1 factoring right now is because GIMPS factors are of a special form (2*exponent*prime+1) that makes them much more likely to be smooth for P-1 than P+1. It seems as though our factors show no similar bias to P-1 smoothness so we can at least say that even if GIMPS has no use for it, we may.

So even when P+1 factoring makes sense, it will probably only need 1/10th or less of the computing power devoted to it that P-1 does so it will not represent a large burden to run. We'll see.

-Louie

**mklasson** · 07-10-2003, 11:09 PM

And I agree with you... I keep forgetting our goal isn't to find factors, but rather to verify nonprimality of _all_ pairs (well, not really) by fastest means possible.

In the same spirit then, is no ecm at all prior to prp-ing really optimal? Got any numbers for factor finding probabilities there?

Also, Louie, could you comment on my earlier question to you about lack of factors > 2^64 in results.txt? Can I find them somewhere else?

**jjjjL** · 07-11-2003, 12:14 AM

Originally posted by mklasson
And I agree with you... I keep forgetting our goal isn't to find factors, but rather to verify nonprimality of _all_ pairs (well, not really) by fastest means possible.

In the same spirit then, is no ecm at all prior to prp-ing really optimal? Got any numbers for factor finding probabilities there?

Also, Louie, could you comment on my earlier question to you about lack of factors > 2^64 in results.txt? Can I find them somewhere else?

it's doubtful the ECM factoring will be useful right now.

the large factors are there now. i moved the results.txt to http://www.seventeenorbust.com/sieve/results.txt.bz2.

there is a line of headers before the (cureent) 3 largefactors. just look at the very bottom of the file.

-Louie

Thread: P+1 factoring -- brainstorming

Thread Tools

Rate This Thread

Display

P+1 factoring -- brainstorming

Posting Permissions