SOURCE CODE RELEASED FOR FACTORER

**jjjjL** · 01-28-2004, 06:36 AM

http://linux.redbird.com/~alien88/sbfactor125source.zip

Here's the scoop. I'm posting the full source for the factoring code. Everything. With this, anyone could compile a brand new, working factorer. Why?

Well, I think there may be a few folks out there who are interested in seeing how it works. I also think there are a few who might want to add to it and that's cool too. Perhaps there are even a few skilled programmers out there who could optimize parts of it. That would be great.

For starters, this code is for version 1.25 of SBFactor. There is a windows binary of the new SBFactor in my UM directory too if you want it. The only real added feature is the ability to function on numbers of the form k*2^n-1 for our pals working on the Riesel problem. I completed the extra code that does all that stuff today.

Just to get it out of the way in advance, I'm sure there are some loose ends in the code I could have shored up but I didn't want to tinker to death with it and never get it out so... here.

There are many places where I have done things without commenting, many more where I have commented out code for no apparent reason. There is excessive use of globals (mostly a holdover from previous code written by GW). And just to top it all off, the whole mess compiles as one C file with other C files included inside it. I know. It's a tad insane. But honestly, I think the code is still pretty easy to navigate once you work with it for awhile. I don't mean to scare people off cause it's really not that bad. I just need to appologize because it's not the finest example of C code. Don't put this stuff in books and tutorials people!

Formalities and appologies aside, there are some real opportunities here for people to get involved in a big way. For one, all the asm code is included in this zip so if someone wanted to try and optimize it to increase speed, please, go right ahead. If the optimizations work, the ramifications are huge. First off, SB's internal math routines are compiled from the same exact ASM files. This means that if you find a way to make faster ASM objects, the SB client itself will experience the same exact speed increase as the factorer will. Also, your optimization may be applicable to a bunch of other projects that use this code as well like GIMPS and LLR.

Other interesting possibilities include (but are certainly not limited to):
-Prettying up the code so it's not so GodAwful(tm)
-Writing a GUI for the factorer. Even a simple MFC would help draw more users into using it than the current CLI only interface I made.
-Add better timing and rate reporting. Right now I just dump status reports with a % finished every now and then. If more accurate progress indicators existed, it would make it easier for people who want to optimize the code to see if their improvements are making a difference.
-Add threading! It probably wouldn't even be that hard to do. Could really help those of us with multiple processors or HT. You could probably even time things so no more than 1 thread was in stage 2 of a test so that there wouldn't be undue pressure put on the memory.
-Make a better configuration file. Errr.. make one. Right now it's all command line option driven. Even with a GUI, it may be nice to have a config file.
-Write some documentation... there is currently no formal docs for the factorer. Almost all knowledge about this program is stored in the pages of this forum. A "HOW-TO" for new users would be nice. A readme for developers could be helpful. Even better usage info on the command line would probably be good.
-Extend the program to handle other number forms. It took me a total of 3 hours today to code, test, and explain to a new set of users how to use a new version of this code that tests different numbers. Want to factor numbers of the form k*3^n-5? You could probably be doing it in a couple minutes if you just modify this code.
-Did I mention you could try to optimize the code for speed?

Seriously though, there's probably still some low hanging fruit in this orchard. One of the reasons I'm releasing this code is so that I can focus on this area instead of doing the other things, which are also important, like making a GUI, etc but I don't have time to do right now.

Also, another benefit is that getting familar with how the base of this code works will help prepare anyone who's going to help develop v3 of the client. Even though there are significant differences in how the main client code works, the base of ASM objects and the squaring routines derived from them is the exact same. Plus it will give people a chance to start interacting with Me and Dave a bit before we open client developement. Give us an idea of who might want to do what in that arena.

So download the code already. If there's interest, I can setup a public cvs server for all this. Hey, the code's out there, if you wanted, *YOU* could start the CVS for this code too. Anyting is possible.

Cheers,
Louie

http://linux.redbird.com/~alien88/sbfactor125source.zip

**hc_grove** · 01-28-2004, 10:01 AM

How do you compile this on LInux? When I unzip it and run make, I get a lot of errors and warnings, here are the first few lines:

Code:

gcc -Wall -O2 -m486 -malign-loops=2 -malign-jumps=2 -malign-functions=2   -c -o 
prime.o prime.c
`-m486' is deprecated. Use `-march=i486' or `-mcpu=i486' instead.
cc1: advarsel: -malign-loops is obsolete, use -falign-loops
cc1: advarsel: -malign-jumps is obsolete, use -falign-jumps
cc1: advarsel: -malign-functions is obsolete, use -falign-functions
prime.c:5:21: windows.h: Ingen sådan fil eller filkatalog
prime.c: I funktionen 'isHighResTimerAvailable':
prime.c:11: error: `LARGE_INTEGER' undeclared (first use in this function)

Well replacing '-march=i486' with '-march=pentium4 -mcpu=pentium4' and changing those m's to f's are easy, as windows.h are included in factor.c if __linux__ is defined, I guessed that I didn't need it and removed that line, but the lack of a declaration of LARGE_INTEGER, I couldn't guess what to do about.

I'm using gcc 3.3.2 from Debian testing/unstable.

BTW: There really should be a license attached, to clarify what might be done with the code.

**pixl97** · 01-28-2004, 03:42 PM

>I guessed that I didn't need it and removed that line, but the lack of a declaration of LARGE_INTEGER, I couldn't guess what to do about.

RunTime: High-performance programming techniques on Linux and Windows 2000.

http://www-106.ibm.com/developerworks/library/l-rt1/

Maybe this will help?

**jjjjL** · 01-28-2004, 04:27 PM

this code was last built on windows and i didn't save the defines I added to compile in linux.

yeah, program all those functions in prime.c with LARGE_INTEGER to just return 0 in linux. they are of no importance. that's how the regular linux version works. I only included them to save me from having to modify more of woltman's code.

you'll probably also want to not include windows.h but i'm sure you excluded that as well.

Cheers,
Louie

**jjjjL** · 01-28-2004, 04:56 PM

Originally posted by hc_grove
BTW: There really should be a license attached, to clarify what might be done with the code.

I don't really care for licenses. You'd have to convince me that there was a good reason for adding one.

I don't care if someone makes a closed source version of the factorer.

I don't care if someone makes a version of the factorer that's incorrect or doesn't work (because the verifier script won't accept a bad factor).

I don't care if someone uses it in a commercial application... because if one existed, they'd probably just hire me to help them make it work anyway.

Aren't those the standard rights that free source people take away from others with licenses?

I'm totally honest here, if there's a good reason to add a license, I'll do it. But from what I know about them, it doesn't seem to make sense in this case. But I could be wrong. Let me know.

Cheers,
Louie

**hc_grove** · 01-28-2004, 05:05 PM

Modifying those functions to just return 0 helped. Apart from some warnings I now only get the following:

Code:

prime.o(.text+0xa713): In function `is_valid_double':
: undefined reference to `eisvaliddouble'
gw/mult.o(_TEXT32+0x5f): In function `_gwinfo1':
: undefined reference to `_CPU_L2_CACHE_SIZE'
collect2: ld returned 1 exit status
make: *** [sbfactor] Error 1

I don't understand the first one there is no is_valid_double function i prime.c.

**jjjjL** · 01-28-2004, 05:22 PM

Originally posted by hc_grove
Modifying those functions to just return 0 helped. Apart from some warnings I now only get the following:

Code:

prime.o(.text+0xa713): In function `is_valid_double': : undefined reference to `eisvaliddouble' gw/mult.o(_TEXT32+0x5f): In function `_gwinfo1': : undefined reference to `_CPU_L2_CACHE_SIZE' collect2: ld returned 1 exit status make: *** [sbfactor] Error 1

I don't understand the first one there is no is_valid_double function i prime.c.

Yeah, I remember those errors too. just add a function in prime.c

eisvaliddouble { return 1 }

it's called from an asm object.

and go into cpuid.c and change all instances of CPU_L2_CACHE_SIZE --> _CPU_L2_CACHE_SIZE. I don't know why but the current version of gcc no longer mangles the internal variable name predicatably. This means the ASM function can't find it.

I think those are the last errors before it will compile. A note: for extra speed you want to make sure mult.o is aligned on a 32-bit boundry. After it compiles, open up sb.map and search for mult.o look at the address it puts it at... it should be divisible by 32 (20 hex). So if it's address doesn't end in 00, 20, 40, 60, 80, A0, C0, E0 you need to make it do that. Do that by editing the makefile and changing the dummy8.o file to like dummy14.o or whatever offset you need to achieve proper alignment.

Don't worry about where the text segment is aligned. that's the second address you'll find mult.o (third instance of the mult.o word) listed under in the map file. it's not as important. Just worry about the first address (second instance of mult.o in the map file).

Cheers,
Louie

**hc_grove** · 01-28-2004, 06:16 PM

That made it compile. And it seems like it would run too. There does seem to be one little change from 1.2, I have to add a + before the amount of RAM to make it run, i.e. where I would have used

sbfactor 5701350 5701360 47 1.5 128 with 1.2, I have to use
sbfactor 5701350 5701360 47 1.5 + 128 with 1.25.

Thank you very much for helping me getting this compiled, now I'll have to see if I improve it in any way.

**jjjjL** · 01-28-2004, 06:32 PM

Originally posted by hc_grove
That made it compile. And it seems like it would run too. There does seem to be one little change from 1.2, I have to add a + before the amount of RAM to make it run, i.e. where I would have used

sbfactor 5701350 5701360 47 1.5 128 with 1.2, I have to use
sbfactor 5701350 5701360 47 1.5 + 128 with 1.25.

Thank you very much for helping me getting this compiled, now I'll have to see if I improve it in any way.

Yes, that + signals it to run on numbers of the form k*2^n+1 instead of k*2^n-1. I recently added that switch so our friends doing sieving for the Riesel project could start testing it out.

If it compiled than it's almost surely working correctly. Good luck improving parts of the code. Feel free to release binaries to show us how things are going if you want.

Cheers,
Louie

**Death** · 01-29-2004, 05:48 AM

http://www.free-dc.org/forum/showthr...&threadid=5245

Maybe now someone can do something with this?

Please, don't tell me "You are free to do this". Goddamit, I'm admin, not programmer.

And having 4 Mac G5 doing stupid RC5-72 I want something from 17oB for macintosh.. please.

**b2uc** · 01-29-2004, 12:04 PM

Originally posted by Death
http://www.free-dc.org/forum/showthr...&threadid=5245

Maybe now someone can do something with this?

Please, don't tell me "You are free to do this". Goddamit, I'm admin, not programmer.

And having 4 Mac G5 doing stupid RC5-72 I want something from 17oB for macintosh.. please.

If you would give Mklasson SSH access to one of thoes G5's for a short time...I know from personal conversation he would love to write a sieve client for it just to see what it can do. Don't know about the work required to get it done though.

So...Mklasson...when you read this...I challenge thy to make a sieve for the G5

**pixl97** · 01-30-2004, 04:40 PM

Ok, got sbfactor to compile on my linux box today, one thing I have noticed is if the SoB.dat or results.txt is missing you get a Segmentation fault. It would be nice to have a error message stating that %file is missing cannot continue. I'll attempt to code that myself (not much of a c coder), where do I send/post any changes to the code?

pixl97

**hc_grove** · 01-30-2004, 05:05 PM

Originally posted by pixl97
Ok, got sbfactor to compile on my linux box today, one thing I have noticed is if the SoB.dat or results.txt is missing you get a Segmentation fault. It would be nice to have a error message stating that %file is missing cannot continue. I'll attempt to code that myself (not much of a c coder), where do I send/post any changes to the code?

pixl97

Don't waste your time. I've just finished coding a version that uses a configuration file, and fixes these. When I've done some testing (anyone who knows a k/n-pair for which a factor can be found using ECM?) I'll put it up for download, and your time will be much better spend testing that.

But else, I guess you'd send small changes to Louie or upload a fixed version somewhere and tell people they can find it there.

**pixl97** · 01-30-2004, 05:28 PM

I already fixed the old code on my box before I got your message, but hey, I learned something new. I am ready and willing to test the new code

____
edit

factor.c:193
- printf("Searching for known factors in results.txt...");
- f = fopen("results.txt", "r");
-
- fgets (one_line, 100 , f); // remove top lines of result file

+ printf("Searching for known factors in results.txt...\n");
+ f = fopen("results.txt", "r");
+ if(!f) {
+ perror("fopen");
+ printf("Fatal error opening results.txt, cannot continue.\n");
+ exit(EXIT_FAILURE);
+ }
+ fgets (one_line, 100 , f); // remove top lines of result file

#include <sys/errno.h> is necessary also.

SoB.dat and riesel.dat @factor.c:153 should use the same code, but I have to look at the if/else statement to make sure it will work that way.

__

Failing to write to fact.txt may also cause a SEGFAULT (or it maybe failing silently, bad

), I havent spent the time debugging what occurs on a write failure, but this should show better behavior.

This has NOT been tested.

ecm.c:863
f = fopen("fact.txt", "at");
+ if(!f) {
+ perror("fopen");
+ printf("Fatal error writing fact.txt, cannot continue.\n");
+ exit(EXIT_FAILURE);
+ }
fprintf(f, msg);

It would be best to print out what could not be written to fact so there is a chance of the user seeing it and submitting it.

**hc_grove** · 01-30-2004, 05:40 PM

Originally posted by pixl97
I already fixed the old code on my box before I got your message, but hey, I learned something new. I am ready and willing to test the new code

Okay. Maybe you should upload it somewhere anyway, so that people who find the segfaults annoying, and don't like my (still unpublished) changes.

**hc_grove** · 01-30-2004, 06:26 PM

Okay, so now my version can rediscover known factors (it really help to know some very smooth factors), and generally doesn't crash too unexpectedly.

If you want to try it out:
My version of sbfactor can be found here!

There are a few different files in the directory:

factor.c - The important source file (the one with interesting changes)
README - A tiny bit of documentation (only documents exit codes at the moment)
sbfactor - A binary version for Linux
sbfactor1251.tar.gz - A .tar.gz which should contain all the sources in a version ready to compile on Linux
sbfactor125source.zip - The source as it looked before I touched it
sbfactor.cfg - A sample configuration file
smooth_factors - The five smootesh factors I've found, useful for testing the ability to discover factors

I don't have any windows machines so I have no idea whether it compileable on windows, but I suspect it will.

Now to a description of the changes:

It shouldn't segfault when missing files
It supports running at different priorities (on Linux - don't know how to do that on Windows)
There's a configuration file
And I have a feeling it forgot something

Thanks to the configuration file, at lot of things became possible:

Running a range of tests with manually set bounds
Running a range of test with ECM
Testing of numbers with small k's and n's or large k's (Make it possible to run really fast tests - it only takes a few seconds to use P-1 factoring to discover that 6541 | 123*2^45+1, but that could probably be done faster by other methods, so don't overuse this)
Probably more that I just haven't though of

About the code:

Less code dpulication
More comments - C-style, as this is C, and I find those prettier to look at
Easier to read

If you set up the configuration file to do a single test, you run

Code:

sbfactor <k> <n>

and in case you want to test a range you run

Code:

sbfactor <n low> <n high>

. In both cases you can optionally provide the name of the configuration file to use (default is sbfactor.cfg) as a third parameter.

Please try it out!

BTW: If anybody is able to determine what parts of the code is mine, they are under a beerware

license (see Poul-Henning Kamp's homepage for a description).

**hc_grove** · 01-31-2004, 06:32 AM

There a new version of my sbfactor! (Actually the version number is up from 1.25.1 to 1.25.3, the .2 version was available for a few minutes, before I remembered that I found a bug but hadn't fixed it

).

Now factoring of Riesel numbers actually have a chance of working (it found out that 5 | 123*2^45-1, but a relevant test would be nice).

Also manually set bounds are checked for (a minimum of) sanity.

Now aborts if trying to use optimal bounds with ECM (optimal bounds calculation is only for P-1).

The following needs serious testing that I can't (effectively) perform:

ECM factoring (needs to know a factor and the parameters that will discover it)
Factoring (both P-1 and ECM) of Riesel numbers (needs to know factors)

And now there's a zip-file too for those windows users that can't figure out what to do with a tar.gz.

If you find error, annoyances (that aren't present in Louie's version) or have reasonable suggestions for extra features post here.

**jjjjL** · 01-31-2004, 08:43 AM

Originally posted by hc_grove
Now factoring of Riesel numbers actually have a chance of working (it found out that 5 | 123*2^45-1, but a relevant test would be nice).

Also manually set bounds are checked for (a minimum of) sanity.

Now aborts if trying to use optimal bounds with ECM (optimal bounds calculation is only for P-1).

The following needs serious testing that I can't (effectively) perform:

ECM factoring (needs to know a factor and the parameters that will discover it)
Factoring (both P-1 and ECM) of Riesel numbers (needs to know factors)

And now there's a zip-file too for those windows users that can't figure out what to do with a tar.gz.

If you find error, annoyances (that aren't present in Louie's version) or have reasonable suggestions for extra features post here.

Here's a post that details some of the tests I've used for Riesel numbers http://rieselsieve.com/phpBB2/viewtopic.php?t=181 . v1.25 has rediscover quite a few known factors and it has been used to discover several other new ones at about the estimated rate that it should. It seems unlikely to me that there would be a catastropic error in the Riesel code.

As for ECM, the simple answer is that it is not reasonable to use ECM to find any factors right now. The code that discovers factors using ECM, although correct, is too slow to compete with the speed of P-1 or just running 2 full PRP tests in terms of efficiency. Try to find a copy of ECMNET (or gmp-ecm) and read the relative parts of their README for a crash-course on ECM factoring.

If you really want to test ECM seriously, you'll have to (re)enable the ability to enter a specific sigma. Without this, the test becomes non-reproducable. Also, I believe some examples of very low numbers (n ~ 100 - 1000) exist which were factored using this method in the factorer thead. There are some numbers in the largefactors table for n just greater than 1000 that I think were done using this method. Talk to Joe_O... it's all his fault

That said, it's cool to see you're making strides to improve the code. If there are other things I can do to help, let me know.

Cheers,
Louie

**hc_grove** · 01-31-2004, 09:11 AM

Originally posted by jjjjL
Here's a post that details some of the tests I've used for Riesel numbers http://rieselsieve.com/phpBB2/viewtopic.php?t=181 . v1.25 has rediscover quite a few known factors and it has been used to discover several other new ones at about the estimated rate that it should. It seems unlikely to me that there would be a catastropic error in the Riesel code.

I haven't touched the math parts of the code yet, so if I set all global variables (and I have gotten some of those wrong during my hacking) and call the right functions with the right parameters it should work, but, in my v1.25.1 I didn't set form correctly, so it would always be factoring Sierpinski numbers, and it's just errors of that kind I'm interested in finding.

I'll take a look at that url.

<edit>
And I just rediscovered the two factors:

Code:

2051479959439 | 65531*2^3853310-1
3253824966283 | 460139*2^1825400-1

so I think it's safe to say that I haven't broken P-1 factoring of neither Sierpinski nor Riesel numbers.
</edit>

As for ECM, the simple answer is that it is not reasonable to use ECM to find any factors right now.

I know that, but again it would be nice to rule out any bugs in how I call the functions.

If you really want to test ECM seriously, you'll have to (re)enable the ability to enter a specific sigma. Without this, the test becomes non-reproducable.

Well, that sounds like a little more work than I'm ready to do, so I'll just rely on the hope that I'm not too stupid to understand how the call is supposed to look.

That said, it's cool to see you're making strides to improve the code. If there are other things I can do to help, let me know.

First of all, you can take a look at it and tell me if you like or hate the changes I've done (you had guessed that, right?)

More importantly you can see if it's still compilable on windows, and perhaps implement priority selection for windows (it's probably mostly about copying code from the prp client, and I don't remember to have seen any bug reports about that part of the code).

And crossing the line from (probably) easy tasks to hard tasks:
(and that means that I don't expect you to do this right now)

Another thing that would make me happy is a code restructuring and a new Makefile so only that parts that have to be recompiled is, at the momemt I have to do a `make clean` when I've changed factor.c.

A code rewrite to eliminate the use of global variables would be really appreciated too.

Should I post in the 'P-1 factorer' thread to find more testers?

**pixl97** · 01-31-2004, 10:32 AM

09:30:47 (153.10 KB/s) - `sbfactor1253.tar.gz' saved [1419046/1419046]

[root@james sb]# tar xzf sbfactor1253.tar.gz
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

tried on 2 different boxes with the same result.

**hc_grove** · 01-31-2004, 10:52 AM

Originally posted by pixl97
# tar xzf sbfactor1253.tar.gz
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

I've uploaded a new version of that file, I've also uploaded a new copy of the zip file that actually contains the same files as the tar.gz.

EDIT:
Perhaps I should add, that these files contain three copies of the code for the main function.

org_factor.c - The original file with only the changes I had to make to make compile
factor.c - My version
myfactor.c - Another copy of my version - just because it was practical when hacking.

**hc_grove** · 01-31-2004, 12:18 PM

Originally posted by jjjjL
I don't really care for licenses. You'd have to convince me that there was a good reason for adding one.

I don't care if someone makes a closed source version of the factorer.

I don't care if someone makes a version of the factorer that's incorrect or doesn't work (because the verifier script won't accept a bad factor).

I don't care if someone uses it in a commercial application... because if one existed, they'd probably just hire me to help them make it work anyway.

Aren't those the standard rights that free source people take away from others with licenses?

Well those are rights often not granted by (more or less) free licenses.

But without a license the only rules that apply are normal copyright, and that doesn't allow me to do anything.

Therefore a license is needed to grant people the rights you want to give them.

As it seems I feel the same way as you about most of what I code, I usually use the beerware license, which is the simplest license I've seen. To the best of my knowledge it's "invented" by Poul-Henning Kamp, who has a good explanation of why he used that on his homepage (I gave a link to that in my original announcement yesterday).

Without my email (which I don't like posting - I get enough spam already), it looks like this:

/*
* ----------------------------------------------------------------------------
* "THE BEER-WARE LICENSE" (Revision 42):
* <me@domain.dk.invalid> wrote this file. As long as you retain this notice you
* can do whatever you want with this stuff. If we meet some day, and you think
* this stuff is worth it, you can buy me a beer in return Henrik Christian Grove
* ----------------------------------------------------------------------------
*/

(And in my interpretation 'beer' doesn't have to mean 'beer' a drink of some other kind will do just as fine).

**Joe O** · 01-31-2004, 12:26 PM

Originally posted by jjjjL
If you really want to test ECM seriously, you'll have to (re)enable the ability to enter a specific sigma. Without this, the test becomes non-reproducable. Also, I believe some examples of very low numbers (n ~ 100 - 1000) exist which were factored using this method in the factorer thead. There are some numbers in the largefactors table for n just greater than 1000 that I think were done using this method. Talk to Joe_O... it's all his fault

I did find quite a few factors with ECM, but I don't know If I have kept the results pages that give the sigma used to find the factors. I'll look for them.

I just came across (again) a very good, clear description of P-1 and ECM by Phil Moore. It's too long to quote here, so this is the link. It is followed by additional insights from E W Mayer and again by Phil Moore.

**Joe O** · 01-31-2004, 01:19 PM

It would be nice if the following output were directed to results.log or something.txt as well as the console:

Code:

SBFactor v1.2
P-1 and ECM factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.5 by George Woltman and Louis Helm
Intel Pentium III or Pentium III Xeon processor detected.
256MB of memory avilable for stage 2
Finished parsing SoB.dat
1374 numbers between 5553000 =< n < 5600000
Searching for known factors in results.txt...Done.
Removed 5 numbers using the factor files
Testing 1369 numbers between 5553000 =< n < 5600000
Reordering array by n value...Done.
Estimating for k=24737	n=5553031
...
Estimating for k=10223	n=5554085

and the result of each run with the timings.

**jjjjL** · 01-31-2004, 03:47 PM

Originally posted by Joe O
I just came across (again) a very good, clear description of P-1 and ECM by Phil Moore. It's too long to quote here, so this is the link. It is followed by additional insights from E W Mayer and again by Phil Moore.

This thread reminded me of another interesting possibility... something I don't think anyone is currently doing, which is a clustered approach to factoring.

Because of the way stage 2 works, it can be run in an embarasingly parallel way. There is no difference between runing

B1: 100000 followed by B2: 2000000

or

B1: 100000 followed by B2: 1000000
and then B2: 1000000 - 2000000

Currently, whenever a stage 2 operation is started (P-1 or ECM), the starting point is the end point of B1 but there is no reason that this has to be done. In fact, there are already built in provisions in the code to do this parallelizing (search for "undocumented"

).

Now B2 works best with lots of memory so I wouldn't suggest spawning multiple threads on a single machine, but a cluster of slower, high memory systems might be put to good use running parallel pieces of a stage 2 test. Considering memory for older systems is essentially free these days, I imagine many people have (or could have) several high mem 200-500Mhz systems.

A good way to do it would be to have a main "supplier" system that could run just the B1 stage and dump the save file into a network directory along with some assignment files. A cluster of 3 - 6 computers continually read this directory looking for work assignments that aren't taken. When it sees one, it removes the assignment file, and copies the save file, then starts it's part of the stage 2 computation. When complete, it can dump some result file into the directory to show that it finished it's work and then get a new assignment.

This sort of system could also be used for doing deep ECM factoring of some numbers. It wouldn't make the job any more efficient, but maybe it would make it more interesting for the searcher since you could find bigger factors.

Cheers,
Louie

**hc_grove** · 01-31-2004, 04:06 PM

Originally posted by jjjjL
This thread reminded me of another interesting possibility... something I don't think anyone is currently doing, which is a clustered approach to factoring.

Because of the way stage 2 works, it can be run in an embarasingly parallel way. There is no difference between runing

B1: 100000 followed by B2: 2000000

or

B1: 100000 followed by B2: 1000000
and then B2: 1000000 - 2000000

I've actually used in in a small scale, and have considered implementing a 'only do stage1' option. ('Only do stage 2' is unneeded). But I don't really know if this is needed, with the configuration file you start a test find out what B1 is going to be, and tthen stop the test and change the configuration file to use manual bound and B1=B2=<whatever the optimal B1 was>. It's only a little more work than having a configuration option.

This reminds of a minor annoyance: When running with B1=B2 status isn't saved after stage 1 GCD, meaning that it has to be redone if stopped and restarted.

**hc_grove** · 01-31-2004, 05:15 PM

Originally posted by Joe O
It would be nice if the following output were directed to results.log or something.txt as well as the console:

Code:

SBFactor v1.2 P-1 and ECM factoring for number of the form k*2^n+1. Adapted from GIMPS v23.5 by George Woltman and Louis Helm Intel Pentium III or Pentium III Xeon processor detected. 256MB of memory avilable for stage 2 Finished parsing SoB.dat 1374 numbers between 5553000 =< n < 5600000 Searching for known factors in results.txt...Done. Removed 5 numbers using the factor files Testing 1369 numbers between 5553000 =< n < 5600000 Reordering array by n value...Done. Estimating for k=24737 n=5553031 ... Estimating for k=10223 n=5554085

and the result of each run with the timings.

That's not a bad idea. I've made a new version (1.25.4) where all output from the main program (defined as having the code i factor.c) is logged to sbfactor.log. If LogToConsole is set to 1 in the configuration file every is also logged to the console, if set to 0 only the initial presentation and errors are logged to the console. As the initial presentation happens before the configuration file is read, the only alternative is to never print in on the console, and I don't like that idea.

As the function that does the actual work is not part of the main program (according to my above definition) it still only logs to the console. The output from that function includes the result of each run.

So this is not all you wanted, but it's a start.

BTW: There is no myfactor.c in the .tar.gz and .zip anymore (just in case anybody has begun relying on that).

**jjjjL** · 02-01-2004, 12:28 AM

Just a little advice, if you want a cheap way to test the client quickly, drop a file into the directory with sbfactor named badsieve.txt with one prime per line if you want to re-discover a known factor without actually using the full test. It will form an exponent with only the factors in the badsieve text file.

Make sure to delete this file after testing though... I've forgotten this before and had issues.

Cheers,
Louie

**hc_grove** · 02-01-2004, 07:29 AM

Originally posted by jjjjL
Just a little advice, if you want a cheap way to test the client quickly, drop a file into the directory with sbfactor named badsieve.txt with one prime per line if you want to re-discover a known factor without actually using the full test. It will form an exponent with only the factors in the badsieve text file.

Oh yeah, I forgot to say, I removed this feature from my version, because gcc said something I didn't understand. Apparently this was caused by me screwing up copying the code from file to another, taking the code from your factor.c again doesn't seem to cause any problems. I'll reenable this in the next version.

**hc_grove** · 02-01-2004, 09:59 AM

I just tried ECM factoring a few numbers with some extremly small bounds (B1=250, B2=251) to prevent them from taking too much time, and noticed that there's an error in the timing rutine.

Code:

Stage 1 complete. 6073 transforms, 1 modular inverses. Time: 753 secs
Stage 2 init complete. 607 transforms, 1 modular inverses. Time: 1014 secs
Stage 2 complete. 36 transforms, 0 modular inverses. Time: 3221143336 secs
Stage 2 GCD complete. Time: 1136 secs

The reported time for the stage 2 transforms is more than 102 years.

**jjjjL** · 02-02-2004, 09:03 AM

Originally posted by hc_grove
I just tried ECM factoring a few numbers with some extremly small bounds (B1=250, B2=251) to prevent them from taking too much time, and noticed that there's an error in the timing rutine.

Code:

Stage 1 complete. 6073 transforms, 1 modular inverses. Time: 753 secs Stage 2 init complete. 607 transforms, 1 modular inverses. Time: 1014 secs Stage 2 complete. 36 transforms, 0 modular inverses. Time: 3221143336 secs Stage 2 GCD complete. Time: 1136 secs

The reported time for the stage 2 transforms is more than 102 years.

The modular inverse count used to be an unsigned long and was later changed to a float value for the asm. It's only a printf failure. The escape char should be %0.f instead of %lu for the mod inv count and then the time will be ok.

Cheers,
Louie

**hc_grove** · 02-02-2004, 10:32 AM

v1.25.5 is now available!

This one should actually be a tiny bit faster

than v1.2, as it only calculates the optimal bounds once for each number (as opposed to twice).

Other changes:

badsieve.txt now supported again
Fixed reported time for ECM stage 2 (credit goes to Louie who told me what to fix)
Some fixed spelling (more errors surely exists )

**Joe O** · 02-02-2004, 02:06 PM

Originally posted by hc_grove
I've actually used in in a small scale, and have considered implementing a 'only do stage1' option. ('Only do stage 2' is unneeded). But I don't really know if this is needed, with the configuration file you start a test find out what B1 is going to be, and tthen stop the test and change the configuration file to use manual bound and B1=B2=<whatever the optimal B1 was>. It's only a little more work than having a configuration option.

This reminds of a minor annoyance: When running with B1=B2 status isn't saved after stage 1 GCD, meaning that it has to be redone if stopped and restarted.

This may work for you and me, but not everyone wants to bother, so I would like to request that:
1) Save the status after stage1 GCD when b1=b2
2) Provide a configuration file option that would allow stage 1 only - this would not only help those people wanting to do B2 "striping", it would allow people to run this program on more machines. i.e. where stage2 might interfere with the normal operation of the machine, but stage1 is perfectly acceptable.

**Joe O** · 02-02-2004, 05:26 PM

Do we have the equivalent of:

- alternatively one may use the B2min-B2max form, which means that all primes B2min <= p <= B2max should be processed.
Thus specifying only B2 corresponds to B1-B2.

viz

Code:

SBFACTOR.EXE <k> <n> <B1> <B2min-b2max>

Can we run:
sbfactor.exe 21181 5689148 25000 225000-430000

to continue on where

sbfactor.exe 21181 5689148 25000 225000

left off?

Note: This is not quite the same as running:

sbfactor.exe 21181 5689148 25000 225000
followed by
sbfactor.exe 21181 5689148 25000 430000 since this run would be looking at all primes p 25000 <= p <= 430000 and require much more memory than the run looking only for primes p 225000 <= p <= 430000

i.e. will what Louie described work with the current factor programs?
Or do we need a revison/enhancement? This would be very helpfull, since we could run sbfactor.exe 21181 5689148 25000 225000 on one machine and copy the save file to 2 or more machines and run
sbfactor.exe 21181 5689148 25000 225000-430000 on the first

sbfactor.exe 21181 5689148 25000 430000-655000 on the second

sbfactor.exe 21181 5689148 25000 655000-890000 on the third ...

**hc_grove** · 02-02-2004, 05:55 PM

Originally posted by Joe O
Do we have the equivalent of:
viz

Code:

SBFACTOR.EXE <k> <n> <B1> <B2min-b2max>

The current versions don't support it, but the function doing the actual factoring does, so a few minutes of coding should be enough to implement it.

**ceselb** · 04-26-2004, 06:59 PM

Any chance of a windows version with these changes?

**hc_grove** · 04-27-2004, 05:28 AM

Originally posted by ceselb
Any chance of a windows version with these changes?

I don't have a machine with windows, so I can't produce it myself. If somebody could provide SSH access (or something similar) to a windows machine with the neccessary tools I could give it a try. If somebody wants to try compiling it, it should work out of the box, but else I'ld like to help.

**ceselb** · 04-27-2004, 06:11 PM

Rayman over at the Rieselsieve project compiled a version for windows. I'm using it now on two machines (stage 1 on a P4 and stage 2 on a P3 with more memory).

Download [URL=http://www.sslug.dk/~grove/sbfactor/sbfact-win.zip[/URL].

EDIT: changed link to hc_groves site instead.

**hc_grove** · 04-28-2004, 06:06 AM

Originally posted by ceselb
hc_grove: could you download and host this? I'm not going to be online all the time (it's on my home box).

Of course. It's now placed along with all the other files at sbfact-win.zip.

**hc_grove** · 04-28-2004, 04:59 PM

I finally took the time and wrote a page about factoring for Seventeen or Bust, mostly to act as a guide in findng the right files.

Thread: SOURCE CODE RELEASED FOR FACTORER

Thread Tools

Rate This Thread

Display

SOURCE CODE RELEASED FOR FACTORER

Well source + GPU.

Re: Well source + GPU.

Another version of sbfactor

Feature request

Re: Feature request

My version updated again

Posting Permissions