Placebo shifting effect

**biwema** · 07-27-2003, 11:56 AM

Sieving currently turns up quite a lot of factors, and most of them are outside the active window. These factors do not score much now, but their score will increase as the active window moves on, as long as no prime is found.

Imagine: after one or two years the candidates will already be sieved up to 500T or more, and the probability of finding a factor will be 10 times smaller.

While the active window moves by 500000, the score of someone who sieved, for example, 100G @50T will increase as much as that of someone who sieves 1T @500T.
So it will take 10 times the effort to keep up with those who are active now. Factors sieved below the active window score even less.
In 1..2 years computers will maybe be 2..4 times as fast, but that will not be enough even to increase one's score as fast as an idle account that was active in summer 2003.

That might be somewhat discouraging.

It is a bit difficult to explain the point...
What do you think? Any comments or ideas?

(a little bit philosophical) biwema

**Slatz** · 07-27-2003, 07:41 PM

big picture...sieving is good for the project

little picture...most of us are doing DC for the stats because we are stats junkies...so I agree that anything that skews the stats over the long haul can be a bit discouraging for those that come in late

I am going to give the new stats method a little time to see how my score looks.

Slatz

**Nuri** · 07-27-2003, 08:20 PM

There is a simple solution for that.

Just introduce * p/40T to the formulas to enable fairness in sieving scores through time.

So, the formulas would look like;

A unique factor will score as follows:

p < 40T, score = p/1T (i.e. as before)
p > 40T, in 'active' window, 0 PRP tests performed, score = (n/1M)^2 * p/40T * 125
p > 40T, in 'active' window, 1 PRP test performed, score = (n/1M)^2 * p/40T * 125 * 0.6
p > 40T, in 'active' window, 2 PRP tests performed, score = (n/1M)^2 * p/40T * 125 * 0.2
p > 40T, outside 'active' window, score = (as duplicate, see below)

A duplicate factor will score as follows:

score = p/100T, capped at 35.

Excluded factors (those factors not present after sieving 100<n<20M to p=1G) do not score.

Why p/40T?

Since we started from 40T, it will be equal to 1 at the beginning, and increase as p increases. Roughly speaking, there will be 10 times fewer factors at 400T (w.r.t. 40T), and each of these will score 10 times higher. (~=1/10*10 ~= equal scoring within sieve)

However, this will create an advantage towards P-1, since it generally finds factors of high p values. To avoid this, Mike can insert a cap to p/40T multiplier of, let's say 10.

This would still mean better scores for P-1 in the short run, but sieving factors will get much higher scores in the long run, simply because they will increase significantly when the active window reaches their n values.
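
For illustration, here is a minimal Python sketch of the formulas proposed in this post. It is not the stats code. The names `unique_score`, `duplicate_score` and `bias_cap` are made up for the example. It reads `(n/1M ^ 2)` as (n/1M) squared, treats p exactly equal to 40T like p > 40T, and assumes a factor with more than 2 PRP tests scores 0, which the post does not specify.

```python
T = 10**12  # one "T" of p, as the unit is used in this thread
M = 10**6   # one "M" of n

# Weight applied for the number of PRP tests already performed.
# Anything above 2 tests is an assumption of this sketch (scores 0).
PRP_WEIGHT = {0: 1.0, 1: 0.6, 2: 0.2}


def duplicate_score(p):
    """Duplicate factor: p/100T, capped at 35."""
    return min(p / (100 * T), 35.0)


def unique_score(p, n, prp_tests, in_active_window, bias_cap=10.0):
    """Score of a unique factor under the proposed p/40T bias.

    bias_cap is the suggested cap on the p/40T multiplier (10 in the post).
    """
    if p < 40 * T:
        return p / T  # as before
    if not in_active_window:
        return duplicate_score(p)  # scored like a duplicate
    bias = min(p / (40 * T), bias_cap)
    weight = PRP_WEIGHT.get(prp_tests, 0.0)
    return (n / M) ** 2 * bias * 125 * weight


# Example: the same n=4M candidate eliminated at 80T and at 1000T (1P).
print(unique_score(80 * T, 4 * M, 0, True))
print(unique_score(1000 * T, 4 * M, 0, True))
```

With the cap in place, a very high p (as P-1 tends to find) gets at most `bias_cap` times the unbiased score.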

**MikeH** · 07-28-2003, 08:39 AM

> Just introduce * p/40T to the formulas to enable fairness in sieving scores through time.

> However, this will create an advantage towards P-1, since it generally finds factors of high p values. To avoid this, Mike can insert a cap to p/40T multiplier of, let's say 10.

I quite like this idea, but with a moving cap that would be the point where sieving is considered (say) 90% complete.

Since P-1 factors are (almost) always found above this point, they would always get an upper bias, but it wouldn't be too much above sieving. This is then very fair, because at any given point in time factors from sieving or P-1 will score about the same. As time goes by, and as sieving progresses the bias will increase, but it will be for everyone.

Now I just need to figure out how to get this 90% point out of my gap finder automatically.

Mike.

P.S. I finally got the scores updating every 6 hours. In the table at the bottom the "change today" numbers might be a bit high until I have a full day's history.

**Nuri** · 07-28-2003, 01:12 PM

> Originally posted by MikeH
>
> I quite like this idea, but with a moving cap that would be the point where sieving is considered (say) 90% complete.

Yes, something like that would be better.

I see that it's hard to automate the moving cap. So, just a couple of brainstorming ideas.... Hopefully some parts of them might help in the final formula.

How about finding the active range first, and multiplying it with a constant to reach the level that would serve as the cap?

(Please note that numbers below are arbitrary. I'm sure a better fit might be found).

- Take the list of p values of factors submitted within the last 7 (?) days.
- Remove the highest (to avoid P-1) and lowest (to avoid DC sieve) 10%(?).
- Take the average of the remaining 80%(?).

Considering that we're moving at ~300-400 G/day (the last two weeks' average is 337G/day), that would mean that at the point of calculation, the current position of the sieve is roughly 1.2T above that average (=0.35T/day*3.5 days).

Based on how far you want the cap to be from the average of the active main sieve range, add a constant to this figure (or multiply it by one) to reach the cap figure in terms of p.

Also, keep track of the cap values for the last 3(?) days (or largest cap value to date). In case the calculated value for today is lower, simply use the other one (maximum of largest cap to date or calculated value today). This might prove useful in case a user sieves a large chunk (say 1T?) and submits his/her factors in a single batch when all is finished.
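
A rough Python sketch of the steps above, to make the idea concrete. The function name `estimate_cap` and all parameter names are invented for the example, the additive `margin` is just one of the two options mentioned (add or multiply), and the numbers are as arbitrary as the ones in the post.

```python
def estimate_cap(recent_p, rate_per_day, window_days=7.0, trim=0.10,
                 margin=0.0, previous_cap=0.0):
    """Estimate the cap (same unit as recent_p, e.g. T).

    recent_p     -- p values of factors submitted in the last window_days
    rate_per_day -- sieve progress per day, e.g. 0.35 for ~350G/day in T
    trim         -- fraction dropped at each end (P-1 at the top, DC sieve
                    at the bottom)
    margin       -- constant added to get from the sieve front to the cap
    previous_cap -- largest cap used so far; the cap never moves down
    """
    ps = sorted(recent_p)
    k = int(len(ps) * trim)            # number of values dropped at each end
    core = ps[k:len(ps) - k]
    if not core:
        raise ValueError("not enough factors to estimate the sieve front")
    centre = sum(core) / len(core)     # average of the remaining ~80%
    # The average lags the sieve front by about half the window.
    front = centre + rate_per_day * window_days / 2
    return max(front + margin, previous_cap)


# Made-up sample in T: one DC-sieve factor, one P-1 factor, the rest main sieve.
sample = [1, 40, 41, 42, 43, 44, 45, 46, 47, 500]
print(estimate_cap(sample, rate_per_day=0.35, margin=10))
```

Keeping `previous_cap` in the `max()` is what protects against a single large batch submission dragging the estimate around.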

Or, alternatively,

- Take the list of p values of factors submitted within last 7 (?) days.
- Starting from the largest ones, search down until you reach a series of three p values, where the two gaps in between are smaller than the largest accepted gap level for that given p level, and pool out the ones larger than your series.
- Somebody might be sieving (let's say) 200T while the main activity is on 50T. So, to avoid a false alarm, take the 10th (20th?) percentile of the remaining p values.
- Again, add (or multiply) this figure with a constant to reach the cap figure in terms of p.
- It seems to me that, keeping track of max cap to date to avoid a one batch large range submission effect will not be necessary in this alternative.

Or, alternatively,

- Keep constant caps for a while, and increase them by 40T/40T (=1) chunks.
- Start with 2 (meaning we're safe until 80T).
- Increase the cap values manually (i.e. when we all start reserving ranges above 70T, move it to 3. Then, when we reach 110T+, move it to 4, etc.)

That would mean a manual intervention every 3 to 4 months with the current sieve speed.

**Moo_the_cow** · 07-28-2003, 01:43 PM

I think that the cap should be quite high (at least 400T for now) so that people can still choose to sieve high ranges (remember that there are people who sieved ranges as high as 1P before P-1 factoring existed).

**MikeH** · 07-28-2003, 05:48 PM

Nuri, thanks for the ideas - very helpful.

I have managed to automate the finding of a particular % point. For information, currently the 90% sieved point is 44629G. That seems like a good point to work with, since most people are sieving in this area, and thus can contribute to pushing it upwards. Even for those that are sieving above this point, the cap will have moved up enough such that most of their factors will enter an active window at full bias.

The 95% point is currently at 35995G. I figure this is not a good one to use because the incomplete ranges that are holding it back are held by only four people - thus the movement of the point is outside the control of most sievers.

> I think that the cap should be quite high (at least 400T for now) so that people can still choose to sieve high ranges (remember that there are people who sieved ranges as high as 1P before P-1 factoring existed).

I have really tried to level the playing field for sieving and P-1 factoring (and anything else that may come along). A side effect of this will be to discourage sieving high p values. From the perspective of the SB project as a whole, this is a good thing since the best place to sieve is low.

If I do apply this bias it will only be to ensure that new sievers are not discouraged because all the valuable sieve areas have gone.

Mike.

**biwema** · 07-28-2003, 07:13 PM

I think this cap is a good idea. Maybe it is better to synchronize this limit with the reserving border rather than the submitting border (or 90% of them). The reason is that if people start to reserve huge chunks (such as Nuri now), it might happen that only ranges beyond the cap limit are available otherwise.

If the cap is fixed for the time being, 2^48 would be a good point due to the border of sobsieve.

I also sieved high ranges before, but I think people will 'learn' when this scoring scheme is applied.

By the way: are the capped scores increasing when the limit of capping is moving?

(edit: typos corrected)

**Nuri** · 07-29-2003, 01:44 AM

What about 90% sieved level + 10T? (or +20T?, which I think would be better).

This will bring a buffer for safety and also avoid reserving very high ranges.

I think it would be no problem if some people were reserving ranges that are higher, but still within the proximity of the main playground. In the end, we will be reaching there within just two months anyway.

BTW, to make it clear, I am generally reserving roughly one month worth of ranges, which is close to 2T. This time, I reserved a bit earlier, because I will be going for a two week holiday next week and I wanted to make sure I can fill all of the queues of all of the PCs before I leave.

**MikeH** · 07-29-2003, 09:09 AM

> By the way: are the capped scores increasing when the limit of capping is moving?

Yes, they will: while they are within an active window, they will slowly creep upwards. As soon as they fall out of a window that will be it until (if ever) the next window arrives. Because of this I'll be changing the bit below the double check window to a 'completed' window, where the bias will not apply at all; otherwise anyone who has submitted big factors below 400K will see their score creep up every day for evermore.

Right now I am going with 90%, and I'd like to see how that plays out - I can always adjust it easily in the near future.

It'll take me a couple of days to iron out all the little issues (off-line) that implementing this has caused, so it'll be a few days before you see the changes.

Mike.

**Nuri** · 07-29-2003, 05:37 PM

Mike,

Have you considered the scoring effect when a prime is found and the related k is removed from the list?

I haven't thought about it in detail, but it seems to me that when this happens, the scores might drop in some cases.

For example, will we still be giving points for the k/n pairs of the cracked k? If we do, the scores will not drop much, but this is against logic, simply because factors for those k/n pairs will no longer be needed by the project. On the other hand, if we stop scoring the k/n pairs of the cracked k, then for roughly 2 months (or however long it takes the PRP clients to cover a range of n=500,000), more factors will drop out of the lower end of the active range than come in at the upper end. Therefore, the scores might drop.

The easiest solution that comes to my mind is to ignore the scoring effect of the factors for the cracked k (when a prime is found, all factors for that k score 0), and multiply the constant in your formulas (which is 125 in the active range formulas) by a factor equal to "the total number of factors in the database divided by the total number of factors for the remaining k values in the database, as of the date the sob.dat is changed".

To say it again, I haven't thought in detail, and might be wrong. But it seemed to me that it is worth considering.
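
The rescaling suggested above is a single ratio. A tiny Python sketch, with a hypothetical function name and made-up counts (nothing here comes from the real database):

```python
def rescaled_constant(total_factors, factors_for_remaining_k,
                      base_constant=125.0):
    """Scale the active-window constant when a k is removed.

    total_factors           -- all factors in the database on the day
                               sob.dat changes
    factors_for_remaining_k -- factors belonging to the k values that remain
    """
    if factors_for_remaining_k <= 0:
        raise ValueError("there must be at least one remaining factor")
    return base_constant * total_factors / factors_for_remaining_k


# Made-up counts: 1200 factors in total, 200 of them for the cracked k.
print(rescaled_constant(1200, 1000))
```

The intent is that the total score across all sievers stays roughly where it was, even though the factors for the cracked k now score 0.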

**MikeH** · 07-30-2003, 08:15 AM

> Have you considered the scoring effect when a prime is found and the related k is removed from the list?

Strangely enough I had been thinking about this yesterday - maybe this is a good omen.

I have a facility to freeze the score for any given candidate - that's what I used on all the candidates pre 21st July. So I can just freeze all candidates for a given k.

At that point, anything that is in or above the main active window will be useless, but since they were generated in good faith, it seems logical that the scores should remain.

Mike.

**Nuri** · 07-30-2003, 12:44 PM

Sure, that seems to be a good way to handle it. Of course, any factors for that k after the freezing takes place would be ignored, right?

**MikeH** · 07-30-2003, 03:43 PM

> Sure, that seems to be a good way to handle it. Of course, any factors for that k after the freezing takes place would be ignored, right?

That's the plan. Right now there's no code to do that, and my hope is that Louie will stop accepting factors for that k, but maybe I need to add something too.

Mike.

**MikeH** · 07-30-2003, 05:58 PM

I've implemented the p/40T bias as discussed above to the scoring. You should see the effect in the next update in about 5 hours time. Anyone with factors in a current window with p<40T will see their score move up a little.

The current 90% sieve point will be displayed along with the max bias, and the individual score cards will show the bias required to achieve the max score.

Mike.

**MikeH** · 08-08-2003, 09:59 AM

> Right now I am going with 90%, and I'd like to see how that plays out - I can always adjust it easily in the near future.

After a week or so of watching this, I am going to stick with 90%. It feels about right. Most people are sieving within the range, or at the very least by the time people are completing their ranges the 90% point has caught up with them.

The eagle eyed among you may also have noticed I changed the conditions for scoring in the 'main' window. Now only factors with 0 PRPs are able to score. This gets around a problem that occurred last week where the next.txt moved down significantly (below 4M), due to expired tests. As a result, some factors moved into the window and were scored which hadn't saved a PRP test.

**Mystwalker** · 08-08-2003, 10:08 AM

Seems like there are some double encapsulated special characters in the main score page - like "0M&ltn< 1M"...
Apart from that, I really like the new layout.

**MikeH** · 08-15-2003, 04:28 PM

Another couple of sieve scoring changes (as if it wasn't complicated enough).

I've added the 'n upper bound' point. If a factor with PRP=0 enters a window between the bottom of the double check window and the 'n upper bound', it will score as if it was in an active window immediately (but it won't increase from there onwards). This means that anyone that submitted a factor in the main PRP test window where the test was abandoned will receive full score at that point. This also means that anyone finding factors for those candidates designated for secret will also receive full score.

The following users will see their score increase as a result (having by luck eliminated secret tests)
MikeH
mklasson
biwema
Mat67

The following people have a nervous time waiting to see if a PRP test is returned.
ceselb
MikeH
biwema

I've also changed the point at which a score stops increasing. Any factor below the main active window which has received a score uplift will receive no further uplift. This basically means that once a factor has scored for eliminating all PRP tests, its score won't jump when it reaches the double check window (very unlikely anyhow).

I'm also reducing the size of the main and double check windows - down to 200K and 100K respectively. However, since I don't want the top of the window to jump, the window sizes will slowly decrease until they get down to the new size. The logic for decreasing is simply to match the top of the main window with where P-1 factoring should be, giving sievers and factorers the same chance of claiming the highest score when they submit. This won't affect anyone's scores, since any useful factor will still enter the window, and is at its most useful as it exits the main active window.

Are there any remaining unfair scenarios that I'm now not covering?

Mike.

**Keroberts1** · 08-16-2003, 05:08 PM

Has it been determined that if an abandoned test has a factor, a new one won't be assigned? Also, it would maybe be nice to have an updated list of factors that have been found that currently have PRPs active, so perhaps users could manually release them and save themselves the work.

**MikeH** · 08-17-2003, 09:47 AM

> Also, it would maybe be nice to have an updated list of factors that have been found that currently have PRPs active, so perhaps users could manually release them and save themselves the work.

It can now be found on the sieve project stats page. As I write this, there is only one potential candidate.

> The following people have a nervous time waiting to see if a PRP test is returned.

...well 3 of the 4 factors have now had PRP tests returned.

**Keroberts1** · 08-17-2003, 12:44 PM

it'll still be fun to watch

**MikeH** · 08-25-2003, 05:07 PM

What about this for coincidence. I was just looking at the Factors found where PRP test may be ongoing, thinking "gosh this has now grown to seven candidates - on the off chance I'll check what my PCs are currently crunching". I opened the first SB client and there it was staring me in the face

k=5359, n=4387702, 12.7% complete.

and

5359 4387702 Mon 25-Aug-2003 Xrillo

So the good news is that it was only 12.7% complete, and since the factor was found today, I've probably only done 1% since then (this PC is also doing some sieving). The other pieces of good news are we now get a chance to confirm that this test won't be re-assigned, and assuming it doesn't, Xrillo you'll get full score when the 'n upper bound' catches up!

I've cleared that job (but not expired it in the pending test management form - let's make it look like a regular 10 day expiry), and got a new one, so I guess the positive way to look at it is that's 87.3% of a test saved.

Louie, when you're next looking at changing the client to server comms, maybe it would be a good idea to add in an "abandon test - factor found". Still not a big deal now, and I know it is very very rare, but when we're up to 20+ day PRPs on a fast P4 it could help a lot.

Mike.

From an earlier post, just noticed how fast the sieving is moving: less than one month ago the 90% sieve point was 44.629T, now it's 57.820T.

100T before Christmas?
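
A quick back-of-the-envelope check of that question in Python. It assumes the 90% point keeps moving at the same average rate as between the two figures quoted in this thread, and it uses the post dates (07-28-2003 and 08-25-2003) as the measurement dates. Both are assumptions, and the function name is invented for the example.

```python
import math
from datetime import date, timedelta


def projected_date(d0, p0, d1, p1, target):
    """Linear extrapolation of the 90% sieve point (p values in T).

    Returns (first date on which target is reached, rate in T/day).
    """
    rate = (p1 - p0) / (d1 - d0).days           # average T per day so far
    days_needed = math.ceil((target - p1) / rate)
    return d1 + timedelta(days=days_needed), rate


when, rate = projected_date(date(2003, 7, 28), 44.629,
                            date(2003, 8, 25), 57.820, target=100.0)
print(when, round(rate, 3))
print(when < date(2003, 12, 25))  # would 100T arrive before Christmas?
```

If the rate drops (people on holiday, ranges left incomplete), the date slips accordingly, so treat the result as a rough guide only.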

**Xrillo** · 08-25-2003, 06:33 PM

**mklasson** · 08-25-2003, 06:34 PM

MikeH,

those new stats are way cool! Factors / range, score, and so on. Now I can finally stop writing the number of factors found after [complete].

**Keroberts1** · 08-25-2003, 10:45 PM

Definitely 100T before Christmas. I'm almost done with my 2T range and I'll be taking another one very soon.

**MikeH** · 08-26-2003, 07:56 AM

> those new stats are way cool! Factors / range, score, and so on. Now I can finally stop writing the number of factors found after [complete].

Thanks, I did wonder if anyone scrolled down that far!

Unfortunately I seem to have broken the detection of new factors in the process.

I'll take a look at it tonight, I guess it's just something simple.

**smh** · 08-26-2003, 09:42 AM

Not sure if it's the right topic, but I was just looking at http://www.aooq73.dsl.pipex.com/scores.htm in the largest factors section.

I'm missing a few very large factors there. I guess they were never reported because they were too large.

In http://www.free-dc.org/forum/showthr...ctor#post23680 I reported a 45 digit factor of 28433*2^265+1

I think this is the largest found so far.

**MikeH** · 08-26-2003, 01:07 PM

> I'm missing a few very large factors there. I guess they were never reported because they were too large.

I guess since the factor is bigger than 2^128, it can't even be submitted on the big factor submission form. I have no problem with recording this (and any others bigger than 2^128), but how do I validate the result? (not that I don't trust you, more as a general case)

**mklasson** · 08-26-2003, 02:23 PM

MikeH, try http://www.swox.com/gmp/#TRY
and enter something like
(28433*2^265+1) mod 16123791857472253188125709918127714272097
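
The same check can also be done locally. A small Python sketch (the function name `divides` is made up for the example); it relies only on Python's built-in arbitrary-precision integers and three-argument `pow`, so factors above 2^128 are no problem:

```python
def divides(factor, k, n):
    """Return True if factor divides k*2^n + 1.

    pow(2, n, factor) does the modular exponentiation, so the huge number
    k*2^n + 1 is never built in full.
    """
    return (k * pow(2, n, factor) + 1) % factor == 0


# Small sanity check: 3*2^1 + 1 = 7, so 7 divides it and 5 does not.
print(divides(7, 3, 1), divides(5, 3, 1))

# The expression from the post above, evaluated the same way.
print(divides(16123791857472253188125709918127714272097, 28433, 265))
```

A result of True means the remainder is 0, i.e. the submitted number really is a factor.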

**MikeH** · 08-26-2003, 04:32 PM

mklasson, thanks for that, it works even with smh's 45 digit factor.

smh (and JoeO), have you submitted all that can be submitted? If so, I'll take a look through those pages and create a local file with those that aren't in results.txt.

I've fixed the scores (should sort itself out in the next update), and added (yet more) detail at the end of the user score cards. Let's see what I've broken this time.

Mike.

**Joe O** · 08-27-2003, 09:57 AM

MikeH, There was an old post, way at the beginning, with some large factors. Anyway, here is a file with what I have. The date on the file is 2003-07-23.

**smh** · 08-28-2003, 03:20 PM

> smh (and JoeO), have you submitted all that can be submitted?

I couldn't really remember, but reading the old post, it seemed that I didn't submit any of the factors I found, but left it for others (JoeO?) to do so.

Those factors aren't really worth a lot. It took much longer to find them than to do a PRP test on such a (relatively) small number. I don't care much about the credit, was just wondering why they didn't show up.
