Input wanted on planned new algorithm from users' perspective [Archive]

Brian the Fist

01-15-2003, 03:47 PM

The new more complex algorithm is coming along now, and I wanted to get some input so we can get any concerns out in the open now while it's still being built. As currently planned, here are the major changes on the user's side that you would experience:

- resumes where it left off when you quit - it will thus only upload after a batch of 5000, and not every time you start/quit
- upload will still be approx. same size as it is now
- you can quit any time immediately
- first batch will be 5000 (you have no control, but see above)
- following batches will be smaller (maybe 200) but take longer to build
- henceforth we refer to a 'batch' as a generation
- more points will be awarded for later generation structures to encourage people to get there rather than sticking at generation 0 which is faster to generate - roughly take the sqrt of the gen # to get the scale factor for the points
- each generation will build from the best structure of the previous generation (i.e. the method is iterative)
- each CPU/instance will run its own set of independent generations etc. and your 'best RMSD' will be the best from any generation on any CPU/instance, but the latest generation best structure from each CPU/instance will be stored by us
- when generation 50 is reached, it will restart at zero again
- server can make cool 'folding movies' from the 50 generations which you can then download and watch
- as a side effect, each machine will need to be uniquely identified so it will not be easy to, say, generate on one machine and then upload from another - this is necessary to avoid other nasty potential problems but should not affect people who use proxy servers or firewalls; only physically moving data around will cause trouble.
- you will at most be able to buffer 50 generations (about 2 days work (maybe)) but this could change

I'll add more stuff as I think of it. Comments and complaints are welcome. Keep in mind some concessions will be necessary to get this new more complicated algorithm to work, which may include alienating some users with special needs.

EDIT:

Actually, I think we can do it without those last two points, so scratch those off
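
For anyone trying to picture the plan above, here is a minimal Python sketch of the generation loop as the list describes it. It is only an illustration under stated assumptions, not the project's code: `make_structure` is a hypothetical stand-in that returns a fake RMSD, and the later batch size of 200 is the post's "maybe 200".

```python
import random

GEN0_SIZE = 5000   # size of the first batch (generation 0), from the post
LATER_SIZE = 200   # "maybe 200" in the post; an assumption here
MAX_GEN = 50       # when generation 50 is reached, the run restarts at zero


def make_structure(parent_rmsd, rng):
    """Hypothetical stand-in for building one structure.

    Returns a fake RMSD: random for generation 0, otherwise a small
    perturbation of the parent's RMSD.
    """
    if parent_rmsd is None:
        return rng.uniform(10.0, 30.0)
    return max(0.0, parent_rmsd + rng.uniform(-1.0, 0.5))


def run_cycle(rng):
    """Run generations 0..MAX_GEN-1 and return the best RMSD of each."""
    best_per_gen = []
    parent = None
    for gen in range(MAX_GEN):
        size = GEN0_SIZE if gen == 0 else LATER_SIZE
        batch = [make_structure(parent, rng) for _ in range(size)]
        parent = min(batch)          # lowest RMSD seeds the next generation
        best_per_gen.append(parent)
    return best_per_gen


history = run_cycle(random.Random(0))
print(len(history), "generations; overall best RMSD:", round(min(history), 2))
```

The point of the sketch is the shape of the loop: one large, cheap generation 0, then smaller generations that each start from the best structure of the one before.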

tpdooley

01-15-2003, 04:40 PM

I can see a few things that might need working out.
At work, the first system to download and start running the new client gets stopped, sends its tiny number of folds back, and then the directory is copied to C:\Folding2. I then stop the other machines, copy this to their drives, kill c:\folding and rename the copy to c:\folding.
(If I ever start running large numbers of systems again, I know I can try and get the proxy server to work - but for 2-3 machines I have on for the whole protein - that equal the 10 I was running, I'll refrain from trying to figure out what was wrong with the proxy server setup.)
When I get new systems to test out - (Brand new Dells or slightly older 256Meg+ machines) I'll copy this same directory on new systems and run it overnight (or 3 weeks, if they take forever to pick it up).
----
So it would be nice if the program doesn't get hard coded for the machine it was installed on.
----
At least one member of my team is using a computer at his wife's work that isn't connected to the internet. Once a week, he'll stop by to get the filelist & .bz2 files, (delete and reinstall the program) take them home, copy them into a blank folding directory - and run the copy of DF in that directory to upload the folds.
Others have mentioned using "sneakernet" approaches on a weekly basis.
----
Will there be a way to upload sneakernetted folds, and over a longer period than 2 days for those machines not connected to the internet? (or will we have to start bringing them back seed information to get them to start up again after the first "generation 50" sequence is finished?)

KWSN_robegeor

01-15-2003, 07:52 PM

" - as a side effect, each machine will need to be uniquely identified so it will be not be easy to, say, generate on one machine and then upload from another - this is necessary to avoid other nasty potential problems but should not affect people you use proxy servers or firewalls, only physically moving around data will cause trouble.
- you will at most be able to buffer 50 generations (about 2 days work (maybe)) but this could change"

This could drastically reduce my output.

The first item wouldn't be a huge deal AS LONG AS one can copy the whole DF folder from one machine (say a non-internet enabled server) to an internet enabled PC and successfully upload the results. Make sense? But, if one can only upload from the machine that produced the results, then I am screwed.

The second item (2 days / 50 generations max before needing to upload) will hurt. If I need to harvest -nonetted results every two days, then I will need to find a new project. I don't have time for that.

Upwards of 10Ghz (and more at times) is only available to me in a -nonet type setup.

rg

FoBoT

01-15-2003, 07:56 PM

i am not a complainer

what ever is best for the project is fine

with that said, if the work can't be moved (easily or hard) from the original machine to another to send, it may impact my ability to participate :cry:

but i can be very creative and am not afraid of obstacles , so again, whatever is best to get the job done (see sig ;) )

my situation really lends itself to projects that allow/accommodate sneakernetting :/

more later

vsemaska

01-15-2003, 08:53 PM

Originally posted by Brian the Fist
- as a side effect, each machine will need to be uniquely identified so it will not be easy to, say, generate on one machine and then upload from another - this is necessary to avoid other nasty potential problems but should not affect people who use proxy servers or firewalls; only physically moving data around will cause trouble.
- you will at most be able to buffer 50 generations (about 2 days work (maybe)) but this could change

I'll add more stuff as I think of it. Comments and complaints are welcome. Keep in mind some concessions will be necessary to get this new more complicated algorithm to work, which may include alienating some users with special needs.

EDIT:

Actually, I think we can do it without those last two points, so scratch those off

According to the 'EDIT:' line it sounds like Howard decided that unique IDs and 50 generations limit won't be needed.

I too 'sneakernet' numerous systems so Howard I think you should keep this in mind since it sounds like a lot of people here do this.

Vic

Insidious

01-15-2003, 09:22 PM

If I have understood this thread correctly, sneakernetting WILL be possible, and you AREN'T planning on limiting us to 2 days worth of work between connections. Is this right?... If so, I don't see much difficulty with the changes.

I think the loss of participants would be dramatic without the above.

I also am wondering how the new scoring system will correlate to the scoring system that is in place at present. Is this new system going to make it impossible to close large gaps in scores created when the clients were fast? (i.e., will newcomers have no chance of ever catching up?) That might cause a bit of heartburn for up-and-comers and/or discourage new participants from joining up.

Finally, I have been watching some newcomer projects commit suicide with premature releases of systems that are unstable and/or dysfunctional, causing loss of work and/or inability to get new work as needed.

I am begging you..... please don't let this happen to DF!!!!!!!

-Sid

vsemaska

01-15-2003, 09:32 PM

Howard,

Since this new algorithm is such a dramatic change, have you considered doing a beta release to a small no. of participants? Let everyone else keep running with the current software until the new one seems stable. I'd be willing to test the new software on the platform I use.

Vic

Brian the Fist

01-15-2003, 10:53 PM

Originally posted by vsemaska
Howard,

Since this new algorithm is such a dramatic change, have you considered doing a beta release to a small no. of participants? Let everyone else keep running with the current software until the new one seems stable. I'd be willing to test the new software on the platform I use.

Vic

Actually, that is exactly what we intend to do. Although based on the dismal turnout for the screensaver 'beta test', i don't know how well it will work. :(

Anyhow yes, we should be able to get the sneakernet working fine.

Also I forgot to mention, stats on Team pages will also list total for current protein and overall so newcomers will still have something to show off (but top 10 will still be overall - I might add a top 10 for current protein as well though, we'll see).

The scoring will be as fair as possible, and we have thought of, and will implement, ways to discourage people from trying to, for example, repeatedly generate and upload generation 49 structures (which are worth more). You could keep generating generation 0 structures (which are faster to generate than all others) but this is why you'll get less points for them - maybe even none - they are just the 'stepping stone' required to get into generation one where the real work begins. Generation 0 structures are like what you're making right now.

Actually I like that. How would people feel if you got zero points for gen. 0 and then got scaled points for future generations as mentioned above? It takes only about 1/2hr-2 hrs. to make the first 5000 for most people, and then days to make the rest, so you wouldn't get credit for that couple hours work, but it would be required to reach gen. 1 where you DO start getting credit. I think this is fair and will encourage people to get up high in the generations. Ok, where are you stats-ho's? Does this sound reasonable?

KWSN_robegeor

01-16-2003, 12:08 AM

I think I qualify as a stat-ho (at least I have been called such).

That said, I am fine with the plan.

Insidious does bring up a good point though. Do you see a big difference in overall scoring using the new algorithm? If there is going to be a big difference, I am not opposed to zeroing out the stats and starting over. Phase 1, Phase 2, etc. After all, the purpose is to refine the algorithm.

m0ti

01-16-2003, 02:29 AM

Howard,

I think you may want to have gen 0 worth some points, perhaps just to get new people started.

I don't know maybe 10 points or something?

Just a question, how much of a difference are we talking in terms of folding time? From your suggested scoring it seems that gen 2 folds take 4 times as long as gen 1 folds. In which case the gen 0 folds would be relatively instant to the rest of them.

Or perhaps you want to reward people more for folds of more advanced generations, and the scoring is not representative of the times?

In any case, I think the ability to work around the last two items is crucial.

And I would definitely be willing to help out in a beta for the new algorithm. If we're producing better results then I couldn't be happier.

pointwood

01-16-2003, 03:07 AM

Originally posted by Insidious
I also am wondering how the new scoring system will correlate to the scoring system that is in place at present. Is this new system going to make it impossible to close large gaps in scores created when the clients were fast? (i.e., will newcomers have no chance of ever catching up?) That might cause a bit of heartburn for up-and-comers and/or discourage new participants from joining up.

-Sid

Wouldn't that make it unfair to the teams that have a lead? How would you make it? Besides, all you need is to win in the daily stats and eventually, you'll be #1 ;)

The only possible solution I can see, would be to close this "version", announce a winner and start from scratch.

MAD-ness

01-16-2003, 03:41 AM

The beta-testing of the screen saver was sort of odd. Needing hard core, early adopters, but testing something they wouldn't use. ;)

If there is an open beta for the new algorithm, I will do what I can to publicize it on the TSF page and over at Ars to get some technically knowledgeable and thorough testers involved. I am sure the other teams would do so as well at their respective homes/teams.

The last two items are the big problems, as I see it. However, that is why you brought this up now - so we can point out problems or circumstances that you have no reason to have experience with or to have thought of.

pointwood

01-16-2003, 03:52 AM

Originally posted by Brian the Fist
Actually, that is exactly what we intend to do. Although based on the dismal turnout for the screensaver 'beta test', i don't know how well it will work. :(

I think that's primarily because most people have no interest in the screensaver. I have very briefly tried it on 2 machines and didn't have any problems. I bet you'll see a very different kind of interest in a beta release of a new console client.

Anyhow yes, we should be able to get the sneakernet working fine.

That I think is a very important thing. Having that feature and making it as easy as possible is critical to many IMHO.

Also I forgot to mention, stats on Team pages will also list total for current protein and overall so newcomers will still have something to show off (but top 10 will still be overall - I might add a top 10 for current protein as well though, we'll see).
More stats are always welcome :cool:

The scoring will be as fair as possible, and we have thought of, and will implement, ways to discourage people from trying to, for example, repeatedly generate and upload generation 49 structures (which are worth more). You could keep generating generation 0 structures (which are faster to generate than all others) but this is why you'll get less points for them - maybe even none - they are just the 'stepping stone' required to get into generation one where the real work begins. Generation 0 structures are like what you're making right now.

Actually I like that. How would people feel if you got zero points for gen. 0 and then got scaled points for future generations as mentioned above? It takes only about 1/2hr-2 hrs. to make the first 5000 for most people, and then days to make the rest, so you wouldn't get credit for that couple hours work, but it would be required to reach gen. 1 where you DO start getting credit. I think this is fair and will encourage people to get up high in the generations. Ok, where are you stats-ho's? Does this sound reasonable?

I have no problem with that. As long as it is equal to everyone I don't see any problems with that.

Come to think about it - could you make it an option to make that a benchmark? It would be cool if the client had a standard way to make a benchmark. Or maybe if the client had a little benchmark protein (which should, of course, always be the same) and you then made it possible to generate x numbers of structures based on that protein with "foldtrajlite.exe -benchmark" or something?

That would make it much easier to compare CPU's, OS'es, etc. :)

EDIT: Of course, the client shouldn't submit any data to the server when using the benchmark protein :)

FoBoT

01-16-2003, 06:19 AM

Originally posted by Brian the Fist
so you wouldn't get credit for that couple hours work, but it would be required to reach gen. 1 where you DO start getting credit. I think this is fair and will encourage people to get up high in the generations. Ok, where are you stats-ho's? Does this sound reasonable?

well, i am only a moderate stats whore :rolleyes:

but i think that might be a good way to handle it

m0ti

01-16-2003, 06:20 AM

I like that benchmark idea! :rotfl:

pointwood

01-16-2003, 06:38 AM

Originally posted by FoBoT
well, i am only a moderate stats whore :rolleyes:

I have a very difficult time believing that FoBot :rotfl:

TheOtherPhil

01-16-2003, 08:11 AM

If there is going to be a change in the way work is "scored", I believe the only fair thing to do is re-zero all the team stats and start again from phase 2. Announce the winners for phase 1 (top 100 teams, top 1000 users etc) and start afresh. This way, everybody will be starting on a level playing field and new users/ teams will feel that they have a chance to compete.

I remember when stanford changed the scoring for G@H work units and it upset a lot of members.

Digital Parasite

01-16-2003, 08:13 AM

Originally posted by Brian the Fist
Actually, that is exactly what we intend to do. Although based on the dismal turnout for the screensaver 'beta test', i don't know how well it will work. :(

As other people said, most of the hardcore people don't run the screensaver so they didn't want to bother trying to install it and test it. I will pre-volunteer for beta testing the new version (especially since I might have to make changes to dfGUI to accommodate it). You can sign me up now. :thumbs:

What you have mentioned so far including your suggestion for stats sounds fine to me. As long as in your original post the last 2 points making it more difficult to copy to machines and nonetting are scratched everything else looks good. I especially like that you will be posting totals and current protein stats as well.

Looking forward to this new version.

Jeff.

MAD-ness

01-16-2003, 08:15 AM

OK m0ti, my attempts to google for it and to look at your team's web site didn't lead me anywhere so I guess I will just ask -

What is dfQ?

Scoofy12

01-16-2003, 08:19 AM

I don't really think it will be necessary to zero out all the stats. after all, the relative sizes of the proteins have been changing with every update anyway, so each has been effectively worth a different number of points. i think it is fair to assign points to each generation relative to the time it takes to generate, maybe normalized to a recent protein and leave it at that. i think resetting the stats altogether would be much more disruptive than just continuing, especially since there is already the precedent that not all proteins are equal anyway.

FoBoT

01-16-2003, 08:28 AM

Originally posted by TheOtherPhil
If there is going to be a change in the way work is "scored", I believe the only fair thing to do is re-zero all the team stats and start again from phase 2. Announce the winners for phase 1 (top 100 teams, top 1000 users etc) and start afresh. This way, everybody will be starting on a level playing field and new users/ teams will feel that they have a chance to compete.

I remember when stanford changed the scoring for G@H work units and it upset a lot of members.

if the scoring was frozen and restarted as a new phase, it would make deciding on the new scoring much easier, since the whole angle of trying to keep it on par with the old stuff wouldn't matter

hmmm, that is something to consider

FoBoT

01-16-2003, 08:36 AM

Originally posted by MAD-ness
OK m0ti, my attempts to google for it and to look at your team's web site didn't lead me anywhere so I guess I will just ask -

What is dfQ?

it is still in the pre-alpha stage i believe

his plan is to write a program similar to SETI Que
a third party caching proxy so that people with many PC's that aren't internet connected (the sneakernet crowd) can have their non-internet PC's connect to a single point/proxy to send/receive work. that single dfQ/proxy machine would then be the only machine that would be required to connect (somehow) to the internet/back to DF HQ (Howard) to send results back to the project

m0ti

01-16-2003, 09:04 AM

Yes, FoBoT, thanks for the answer. It will allow a drastic cut-down in sneaker-netting. If you've got a single machine connected to the internet in a lan then you won't have to sneaker-net. Otherwise, you can collect all of the results (in a single file) from a single machine, transfer them to a machine with an internet connection and upload them.

It's been an off and on project for a while, depending on how utterly insane my life is. :D

It's pre-alpha and I haven't worked on it in a while since I'm trying to complete some crunching (about 2 - 3 days worth after some optimizing... and then some more optimizing...) which I then have to analyze for a paper I want to put out (non-DF related).

In the meantime, there's always dfDetect (http://t2.technion.ac.il/~sm0ti/dfDetect.zip) which is a little Win32 utility I made (at teammates' request) that does some worthwhile stuff:

- works for DF installed as a service (or multiple services) or CLI.
- make sure DF is always running
- run DF in completely hidden mode (useful for Win9x users)
- restart DF after X minutes have passed
- stop DF/keep DF from running while program X is running (great for corporate farmers with CAD machines/gamers).

In any case, I'll probably be back doing the dfQ thing soon (I hope). February perhaps will see a release (I was originally hoping for January, but, life got in the way ;)).

Brian the Fist

01-16-2003, 09:31 AM

Note that your dfQ may have to be changed a bit when the new algorithm is released.

Anyhow, I'm going to hold you all to those beta testing promises!

Now it appears there's 2 camps. Some say reset the stats to zero, some say leave them. Personally, I don't care about stats as you know so I'll do whatever sounds most reasonable.

To help you decide, a bit more about the 'new' scoring. Each generation should on average take about the same time to generate (except the gen. 0 which is fast and gets zero points anyways). The time will still increase with protein size but not as much necessarily. You will get higher points for later generations (gen. x points = sqrt(x) * gen. 1 points) to encourage and reward you for running it long enough to get to higher generations.
We will set the points for gen.1 to roughly correspond to the number of structures of length 100-150 that you could generate with the present method in the average expected time it will take to finish gen.1 (which should be roughly CPU independent if you read that sentence carefully). This number is yet to be determined and depends on the final generation size we choose but figure around 1000-10000.
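
To make the scaling rule above concrete, here is a small Python sketch of "gen. x points = sqrt(x) * gen. 1 points", with generation 0 worth nothing. The function name and the value 1000 for gen. 1 are assumptions for illustration only; the post just says the final number is still to be determined, somewhere around 1000-10000.

```python
import math


def generation_points(gen, gen1_points):
    """Points for one finished generation under the proposed rule.

    Generation 0 earns nothing; generation x (x >= 1) earns
    sqrt(x) times the points of generation 1.
    """
    if gen < 0:
        raise ValueError("generation number cannot be negative")
    if gen == 0:
        return 0.0
    return math.sqrt(gen) * gen1_points


GEN1_POINTS = 1000  # hypothetical value picked from the 1000-10000 range
for gen in (0, 1, 4, 9, 49):
    print(f"gen {gen:2d}: {generation_points(gen, GEN1_POINTS):.0f} points")
```

Under this rule generation 4 is worth twice generation 1 and generation 49 seven times as much, so the reward grows with depth but far more slowly than the generation number itself.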

So opinions? Reset stats to zero or keep 'em? I'm especially interested in hearing from people in the top 10 on this, who might be the most pissed off if stats are zeroed...

KWSN_Millennium2001Guy

01-16-2003, 10:36 AM

A top ten-ner checking in.

1. I am okay with resetting the stats and starting the competition from zero on Phase II since it IS a fundamentally new algorithm. If handled well this could be an opportunity to generate an influx of new participants.

2. I am also okay with leaving the stats as they are now and just adding the new stats to the totals already generated. Since the stats calculations are changing for everybody at the same time it is essentially the same as the fast protein/slow protein differences that we have seen up to this point.

3. My gut feel is that resetting the stats to zero and calling it DF phase II would be the most fair to all concerned. I am excited that we are trying new algorithms in an attempt to improve the quality of the science.

.. a quick note to those who are wondering why my account isn't generating the huge numbers that it once did... be assured that my machines are still contributing just as much as they always have to DF, they are just going into different buckets.

also... lemonsqzz, read your pm.

Ni! :cool: :p :cheers:

FoBoT

01-16-2003, 11:07 AM

Originally posted by Brian the Fist

So opinions? Reset stats to zero or keep 'em? I'm especially interested in hearing from people in the top 10 on this, who might be the most pissed off if stats are zeroed...

well, another way to look at this, is not who you will lose , but who you won't gain

ie

if you reset the stats, you take the risk of pissing off current crunchers, that is, you may drop in active participants

if you leave the stats the same, but new people have less chance of catching up, due to differences in the new scoring under the new algorithm, then it is likely that you keep the majority of the old crunchers, but maybe new people look at the numbers and decide to pass on joining DF, because they feel they have no chance to move up

what i mean is, i think there is more short term down side risk to resetting the stats than to keeping them and adding to them with a new scoring system under the new algorithm/system

so the "safe" bet is to leave things alone as much as possible, but by doing this, you may be limiting the future potential gains of an influx of new peoples

however, i am not afraid of a shake up myself, so http://www.icalledit.org/forums/images/smilies/trommel.gif

Scoofy12

01-16-2003, 11:14 AM

This may have been mentioned, but it doesn't seem like it would be difficult to keep two sets of stats, a phase 2 set, and an overall set. as to which one would be more "official" or "valid" could be left up to debate :)

grobinette

01-16-2003, 12:04 PM

I don't have a problem with re-setting to zero. It evens the playing field for all teams and team members so there is a big plus for that if you were a latecomer to the program.

If it makes it easier for your purposes, then by all means simplify it and zero it out.

ulv

01-16-2003, 12:06 PM

I'll go for two sets of stats, overall and phase 2, IF that's OK with the statsgurus.

bunker

01-16-2003, 12:24 PM

The new algorithm sounds great. I'm all for zeroing out the stats. We have a few inactives with a larger total than me that I could catch a lot quicker! :D

KWSN_robegeor

01-16-2003, 12:35 PM

top ten-er (barely...and probably for not much longer---there is some serious horsepower closing on me quickly!)

I am fine with any of the options: Reset to zero, continue on, or start a new set of stats. It's all good. I just appreciate the openness and dialog.

Welnic

01-16-2003, 12:52 PM

I think that either zeroing the stats and starting a new phase is fine, or continuing on with current stats. The WUs generated so far have had enough variety in how long they took to make that if the new system is off by some it doesn't matter that much.

Around one year seems to be a reasonable length for a phase of this project. If you think that there will be more future big changes like this you should probably start a new phase. The next time it seems like it would be harder to zero the stats.

As I already spend too much time checking the stats I think having total and phase stats would be a little too much.

Finding beta testers for a new real client should be way easier than a windows screensaver.

KWSN Grim Reaper

01-16-2003, 01:11 PM

I personally don't have a problem with zeroing out the stats and starting over, but I really do fear what it would do to the project. People who haven't been running the project from the beginning would feel that their month or two of work has been a waste of time, and those who are obsessed with stats...well, I don't even want to think about that:(

As has already been said, we've changed proteins, tweaked the client to produce over twice as many structures in the same amount of time, and made other changes, and none of them have warranted any changes to the stats. Each protein has varied in the amount of individual production, the new client should be no different, as long as everyone has to do the same thing. Say, no credit unless you finish the entire run.

Ni!

Paratima

01-16-2003, 01:11 PM

So, after a year (!), I FINALLY make it into the top 20 (:D ) and you want to flush it all away!!!

Well, flush 'em. I was considering starting over from zero with a different name, anyway. :p

The journey is way more fun than the destination.

Robor

01-16-2003, 01:14 PM

I'm rather new to the DF project but IMO what should be done is what's best for the project. I have to admit that I'm a stats whore but I understand the goal of the project is more important than the stats.

Sooo... I'm fine with resetting them to zero, keeping dual stats, or modifying the stats once the new (slower) program is in place. It does sound like the easiest thing would be to reset the stats and start from scratch though.

vsemaska

01-16-2003, 01:22 PM

1) Don't worry about finding beta testers. The dismal response to the ScreenSaver was because most of us are hardcore crunchers and had no interest in it. A new algorithm that may produce better science is a totally different story. :D

2) My vote is to zero the stats. When F@H went from V1 to V2 they didn't carry over the stats and there were lots of complaints. Gradually all that died down and it was back to business as usual.

Only suggestion I'd make is keep the Phase I stats around somewhere in case people want to see them. Maybe www.statsman.org could do something.

Vic

OgreChow

01-16-2003, 01:24 PM

I am all for resetting the stats - the people at the top got there by having the best systems and most time commitment - this is something that will not change and they will have no problem staying on the top. Besides, doesn't it get boring being so far ahead of the competition?

The people at the bottom will be happy to have a chance to catch up, and to get a more accurate idea of how they stack up.

I would definitely like to see more short-term stats in the future, such as highest ranked per protein, per week, per day, etc. This would let new-comers feel an immediate impact.

-OgreChow

Louis

01-16-2003, 02:27 PM

I'm not sure I fit in the category of stats-ho or not, but.... After all, I did set up a one-person team because being 198 is better than being 1000+.

My druthers would be to reset the stats. Most of the folks at the top got there with some mad horsepower, and they'll probably jump out in front again pretty quick. Plus resetting the numbers would get the inactive folks off the list, which might be a positive to some folks (those obsessed with stats).

It also seems that the new algorithm results might not be suited for a straight 'results returned' number, but more of a points scenario. On the united devices Think projects, the results are tracked by CPU time, results returned, and points based on power, etc. Something similar might work well - consider the sets returned at each generation as a results set, and count them for each generation. Then have a points calculation based on generation, time to complete, CPU speed, kilowatts used, or whatever.

But hey, as long as I can watch the pretty pictures every once in a while... :crazy:

bwkaz

01-16-2003, 02:38 PM

Yeah, the pretty ASCII pictures are all I want to see, really. :D

If you couldn't tell, I don't care one way or the other what you do with resetting. If you reset them, then it might be a better indication of how we're doing with respect to who's running the client at that instant, but that kind of info will go out of date really fast anyway. If you don't reset them, meh, status quo.

*shrug* Do whatever. ;)

MAD-ness

01-16-2003, 03:10 PM

It seems obvious that those with a lead would be the ones with the greatest motivation to keep the status quo.

In my opinion, as a member of TSF, if the stats were zeroed it would be real nice to see a page where the final rankings of "Phase 1" were recorded. Not only for TSF but for the other teams who have worked so hard to achieve their rankings and the same goes for individual rankings.

That said, I believe that having a fresh start (but carrying the current momentum of the project as well as many of the more robust features and stats that have developed over the last year) could be a GREAT boon for the project, especially encouraging those who are intimidated by the incredibly high scores recorded by some teams and individuals.

If you put the amount of logical thought into this situation as you have most of the other situations I don't foresee a large problem.

frozenchosen

01-16-2003, 03:24 PM

I like zeroing out the stats. Make a page that tells us who the winners were for phase 1 so that we can all go and pay our respects and admire their fine accomplishments. Maybe even make it a hall of fame with statues and stuff. Then let's get on with the new stuff. The thing that is most interesting about this project is that we are assisting in validating NEW algorithms that are becoming more sophisticated. I like the idea that my contribution, small as it is, is unique and is not being duplicated by five or six other people around the world.

Brian the Fist

01-16-2003, 03:32 PM

As a side benefit, I can avoid any Int4 overflows I haven't already caught as well! :smoking:
Ok, well sounds almost unanimous that zeroing the stats is a good idea, but also keeping a page with final Phase I stats available for viewing (just top 10, or all the final team pages too??).

Keep in mind this won't be happening for a while still though. I'd expect it to be ready for beta testing in a couple weeks at the earliest.
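For readers wondering what an Int4 overflow looks like in a points total, here is a minimal sketch. It assumes Int4 means a signed 4-byte (32-bit) integer; the function names and the point values are made up for illustration and are not taken from the project's code.

```python
import ctypes

# Largest value a signed 4-byte integer can hold (assumption: Int4 is signed 32-bit).
INT4_MAX = 2**31 - 1


def add_int4(total, points):
    """Add the way a signed 32-bit counter would: silently wrap on overflow."""
    return ctypes.c_int32(total + points).value


def add_checked(total, points):
    """Add with an explicit range check instead of wrapping."""
    result = total + points
    if result > INT4_MAX:
        raise OverflowError("total no longer fits in a signed 32-bit integer")
    return result


if __name__ == "__main__":
    # A hypothetical total of 2 billion plus another 500 million wraps negative.
    print(add_int4(2_000_000_000, 500_000_000))
```

Starting the totals again from zero simply pushes the point at which a 32-bit counter wraps further into the future; it does not remove the limit.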

Aegion

01-16-2003, 03:38 PM

Originally posted by Brian the Fist
As a side benefit, I can avoid any Int4 overflows I haven't already caught as well! :smoking:
Ok, well sounds almost unanimous that zeroing the stats is a good idea, but also keeping a page with final Phase I stats available for viewing (just top 10, or all the final team pages too??).

Keep in mind this won't be happening for a while still though. I'd expect it to be ready for beta testing in a couple weeks at the earliest.
I would strongly advise making it everyone. The final rankings will be an important piece of information for many people.

AMD_is_logical

01-16-2003, 03:39 PM

On sneakernetting:
It looks like what you really want are complete sets of 50 generations. Could you make it so that a script could easily identify completed sets on a no-netting client, and then move the files for those sets out of the client's directory? Right now I need to shut down the client (by deleting the .lock file), because one of the .bz2 files is in use when the client is running.

With the current client, I can just remove all work and upload it later. With the new client I will only want to remove completed sets, so I will need a way for my script to remove only those sets, and not the partly completed set that the client is working on.

Also, the client should be able to handle power failures. Currently the client tends to leave an orphaned .bz2 file sitting around.

On stats: I vote for keeping the current stats. I think, however, that the new client should be normalized so as to produce somewhat higher numbers (for a given amount of CPU-hours) than the old client did. This would reduce the lead of the older teams, (in terms of CPU-hours), but in a positive way (Hey, look at the great production I'm getting), rather than in a negative way (What the ... all those months of hard work ... GONE).

Also, for a new team, stomping one's way up the ladder is where the fun is. Resetting the stats ruins that fun.

vsemaska

01-16-2003, 03:48 PM

Originally posted by MAD-ness
It seems obvious that those with a lead would be the ones with the greatest motivation to remain the status quo.

I don't see why. If they have the CPU horsepower to pull out ahead, they'll just do it again when the stats are zeroed.

Vic

tpdooley

01-16-2003, 03:54 PM

When I started folding in June, I joined a team with about 30% inactive accounts. After running through many of the inactive accounts, and those that only folded for a few hours a day - I noticed that I needed more horsepower to catch up to those at the top. And I'll have a higher score than the person currently in first place on that team within a month or two.
For some of us, having a long list of those to overcome in the score race is a wonderful challenge and really got us hooked.

m0ti

01-16-2003, 04:00 PM

Not being much of a stats-ho myself (and I'm the one putting out all the stats for my team. What I lack in terms of statslust I more than make up for in team spirit!), it isn't too much of an issue for me.

However, if the stats are reset, make sure to maintain the stats for everybody (perhaps nicely organized ala Dyyryath's or statsman, etc) so that nobody feels that their contribution to phase I of the project wasn't "important" enough to warrant it appearing in the stats for the phase.

Kibosh

01-16-2003, 05:09 PM

I have two thoughts on the whole process.

First, if you don't count the first 5000 structures I am going to lose out a lot more than some people, since I run a fleet of K6-2s that take a good long time per structure (unless of course they are generated so fast it doesn't matter). If it takes multiple hours on a P3 or P4 though, then my K6-2s will be severely handicapped since they will take much longer to get to the "counting stage." And yes, I am a stats whore... if the project doesn't have stats then I don't run it.

Second, I think the stats should be zeroed or you will get people upset. If you declare that the project is ending (be sure to give a date far enough in advance that people can ramp up and try to "win") and then declare a winner, that seems like the best option. That way, nobody can get pissed off since the stats changed. You can keep the usernames and teams and everything but start a new project (DF2 or some such)...

Just my thoughts.

Kibosh
aka SphincterLord

FoBoT

01-16-2003, 06:08 PM

Originally posted by Aegion
I would strongly advise making it everyone. The final rankings will be an important piece of information for many people.

yes, it needs to include everyone that participated.

remember how many times the little guy has had the best RMS? that is one of the things about this project that is so great, truly ANYONE can be the one that finds the needle in the haystack, and with DF , you can actually see this in action

Dyyryath

01-16-2003, 06:19 PM

I don't mind the thought of resetting the stats. It sounds like the project is legitimately moving into a different 'phase' anyway. :thumbs:

The new algorithm sounds good, though I probably won't have anything constructive to add about the scoring until I see it in action. That said, I'd LOVE to beta test a new client. I've got all kinds of different machines/architectures to test on as well.

The benchmark idea mentioned above is also a good one.

I think the addition of a 'current protein' total to the project stats is going to be a good thing. reader50 over at Team MacNN has already been trying to track this and it's really a pretty neat feature.

I considered trying to track it myself, but decided that even the small amount of inaccuracy caused by the overlap time was annoying enough to me that I didn't want to bother with it.

Will you be presenting the format for the new stats output when you have people running the beta client? It'd be nice for those of us building third party stats to have a little lead in time with the new output before you officially move people to the new client.

cygnussphere

01-16-2003, 08:25 PM

Looks like it's time for an opinion poll?

Spankin Partier

01-16-2003, 08:31 PM

As a member of a fairly new team that's still rising through the ranks (currently ranked 18th), I'd like to see the stats remain in place. By resetting the stats, the passing that occurs currently will stop as current production will dictate the new positions. Our team has traveled through various DC projects (Seti, Genome, and Folding) and always celebrated the passing of a team. We've also enjoyed the challenge of the occasional upcoming team that threatens our position. By resetting the stats, things will become very static. But if they are not reset, I would hope that the new points per CPU hour will be approximately equal to the original points per CPU hour. This wasn't considered when the Seti project switched to ver 3 of their client. A lot of members were lost because of that. :(

On a different note, most of my farm is located at work. I have these machines set up to run only after hours. If I stop a client after say the 20th generation, will it continue again on the same generation?

Thanks,
Have Fun! :D

prokaryote

01-16-2003, 08:43 PM

If you're going to zero the stats for phase II of the project, may want to consider the team jumping issue as well then? Or not.

lemonsqzz

01-16-2003, 09:23 PM

Yeah.. you can strip me of my billions.. :swear:

I'm still rich in friends here... It'll be fun to do all over again... and let's try to clear out the zero-ever producers from the system...

I think the reward should be based on CPU time somehow.. then everybody is equal... points scaled to the speed of the CPU that crunched them.. more for the slower ones since that is more painful to run.. I'll go with the flow though..

Insidious

01-16-2003, 09:32 PM

I think DF Phase II is a great idea. I would like to suggest you keep the PhaseI results intact and viewable.

I think there might be a feeling of futility created if all those months of work just vanished. The obvious thought would be, so are Phase II results worthless also?

So you have one more hat in the ring for zero points to begin the next DF phase.

-Sid

Spankinmonkee

01-16-2003, 09:49 PM

I'm open for the good of the project as long as it will let me continue to nonet.. if not, Spankie's toast, as I run all my systems @home here and all but one are full-time noneters :cry:
Also, will there be any change as far as CPU load stress over the current algorithm?

Spankie :spank:

Tawcan

01-16-2003, 09:53 PM

First of all, I got to say this project sure is user oriented. I like it. :thumbs: (You know what I mean if you've run Folding/Genome before).

Could someone explain how exactly the generation 50 works? From what I've read it seems like you need to keep systems running for a long time in order to reach generation 50? :confused: If so, it doesn't seem to be fair to those who aren't crunching 24/7. I like the resume idea, but does this mean you resume the generation work as well? :confused:

So basically you crunch a protein but you can get different generations? :confused:

As for nonet. It would be nice to have similar nonet ability as the current client. It's easier for those of us who can't connect to the net all the time or have limited access to the internet.

As for stats, either way works for me. I'm on the same team as Logical and Spankin Partier. We're a pretty new team and we're currently #6 production I think. Most of our fun came from spankin errr passing :p teams. Like SP mentioned, if the stats reset it might not be as fun for our team. Resetting the stats would be good in the sense everyone starts fresh and new teams will have a chance to pass some older teams.

If the stats were to stay I would like the scoring system to be approximately the same as now in terms of CPU power vs time.

jaydee116

01-16-2003, 10:01 PM

I am also a part of the Killer Barbarian Frogs. I started DF about a month ago and have been climbing in my own team ranks, and we have been climbing in the overall teams pretty steadily. Still I am not opposed to starting fresh. It all has to end sometime. We can make this round a learning one and see what happens so we don't make the same mistakes next time. If we do keep going with the current stats our team can enjoy the battles upcoming, but still we will reach our peak sooner or later. Not zeroing is just putting it off a little. Anyone who thinks their progress would be wasted by starting the stats over should remember what the project is really about. I joined DF because I felt SETI was becoming quite useless with the redundant units being so high. If I ever feel this project is a waste of my resources I will move on. Stats just make it more fun. :D

MAD-ness

01-17-2003, 03:02 AM

It is good to see some of the Killer Barbarian Frogs posting here on the forum. :)

Welcome guys.

I think that it is VERY important to have a stats system that is accurate, consistent and reflects the processing power contributed.

Projects that do goofy stuff (like uh, I think it might be UD) or reward based on some sort of 'curve' are a major turn off for many of the more hard-core DC people.

If people want to compare different CPU architectures, they can do the math themselves, comparing the units/time ratios. Trying to build this type of stuff into the stats makes things overly complicated and less accurate, IMO.

m0ti

01-17-2003, 03:39 AM

I agree with MAD-ness.

Stats should be kept respective of the number of folds produced and not the amount of CPU time utilized. That would be a separate stat I think people would be interested in (and how long on average per fold... sort of like SETI and their total crunching time, average time per WU thing).

Actually, after giving it some more thought I think that ending off phase I, saving all the results to be viewed in a convenient way, and resetting for phase II could be a very good idea. I think very few people will leave DF as a result, particularly due to the close attention paid to what the users want, and a lot of the users are voting to move on to phase II in this thread. Plus, it can be a boost for recruitment: come in and join up while the project's starting and you can climb high and fast. Should be of particular interest to anybody who's remotely interested in stats (everybody likes to be in the top X overall ASAP).

Thanks for the time and effort to listen to your community, Howard!

Michael H.W. Weber

01-17-2003, 04:53 AM

1. Our team will have a big problem if it would no longer be possible to move the results from one computer to another for upload. Similar to FoBot, we have quite a couple of machines (if not most) that don't have a network connection at all. I believe that this project will run into general problems if upload is bound to the computer on which the results have been generated (and participation in this project is comparably low, anyway).

2. I don't think that it would be a nice idea to delete the present stats (if this is considered at all). This would look as if we never contributed to this project. I have no problem with creation of a new stats system as suggested - although getting zero credit for initial generation is NOT acceptable (I pay the electricity bill and I want to see where the money has gone). However, the old stats must at least be kept somewhere for the record.

3. Any (preliminary) news on the CASP results? I wonder why such drastic changes of the algorithm are undertaken (not satisfied with the current results?).

Michael.

Vato

01-17-2003, 05:12 AM

Originally posted by Brian the Fist
Ok, well sounds almost unanimous that zeroing the stats is a good idea, but also keeping a page with final Phase I stats available for viewing (just top 10, or all the final team pages too??).

Keep a snapshot of the position and points for every team and user.
Everyone's contribution is worthwhile, and what's the cost of it?
It certainly avoids the project looking like NEO.

HaloJones

01-17-2003, 06:06 AM

As a nearly top-tenner, my immediate reaction is :shocked:

If we start again, my production will put me around the #10 slot from where I will never deviate. Wow, that'll be exciting. Maybe I could turn off all the systems every couple of weeks to let #11 pass me and then blast past again. Yup, that'll be just grand.

I have goals at the moment. I want to pass you Brian and some others up there. I want to try to fight off the guys coming up behind me. If you zero the stats, the only movement will be from people who like to hoard and dump.

Every protein has been different. Some people have only crunched for DF on the fast proteins and skipped the slow ones; personally, I have crunched every protein irrespective of its speed. So the scoring of the new algorithm will be different - why is this time so different that we have to have new stats? Why wasn't this done for each new protein then?!

Score the new system in a similar way to any of the proteins we've been working on and no-one should have any reason to feel that they will be any less able to catch others than before. New users already see huge mountains ahead of them but don't demand that the people who've helped make this project should be forced to start over.

You can easily see from the benchmark threads here how many structures a range of machines can create per hour/day. Time the new algorithm over a day and work out how many points each "generation" should get. So we're no longer getting points per structure! So what! We could be getting points that equate to what has gone before.

Keep the current stats!

TheOtherPhil

01-17-2003, 06:23 AM

I don't think it will be that big of a problem....with your production Halo, you will reach number 10 pretty soon anyway and will be in the same position that you are objecting to. This is not just a protein change but a substantial change to the client, code and the entire project. This is indeed a different "phase" of the project and I believe that the stats should be zeroed.

FoBoT

01-17-2003, 07:11 AM

Originally posted by Michael H.W. Weber
3. Any (preliminary) news on the CASP results? I wonder why such drastic changes of the algorithm are undertaken (not satisfied with the current results?).

Michael.

i will look for the quote from howard (or he can jump back in here and repeat it himself), but howard posted that changes are needed to refine the process to be more competitive (ie get better results) with the other processes that turned in results to the CASP thingy (ok , i admit it, the science is all way over my head :crazy: )

any way, i will go look for the quote, hold on

here we go, i think you can get the idea from these 3 quotes

We will post a complete 'report' as soon as we have gone through all the relevant numbers. While I can tell you we did not 'win', we appear to have done reasonably well considering our algorithm is still in its infancy compared to many of the other participants. Clearly brute force alone is not sufficient to predict protein structures but in combination with a 'smart' algorithm we believe it can far exceed what anyone else will be able to do without DC. As we continue to test new enhancements to our current basic algorithm and scoring functions, we expect to see great improvements in our ability to sample structures and pick out low energy ones.

As for the new method, I am busy coding it now as we speak, the new infrastructure I'm putting in will allow us to try several different ideas, all sharing the common idea of iteration. i.e. make some structures locally, then take the best one(s) and do something with it to then make more structures, etc. Thus there will be a lot less uploading to the server but occasional downloading as well. Most changes will be invisible to the user though. We'll keep you posted as it develops.

Our current plan is to improve the sampling from 'random' structures to doing it more intelligently. We shouldn't need to sample 10 billion structures to get what we are getting, if we do it a bit more intelligently. Massive computational resources will still be required though

it seems to me (in my simple way) he is saying that improvements need to be made to get better results

for those involved from the beginning (about 1 year ago), howard has always intimated that this was a work in progress and that the science may dictate changes, after all what is the goal? to figure out a better way to do this folding stuff, right? :)

HaloJones

01-17-2003, 07:32 AM

Originally posted by TheOtherPhil
I don't think it will be that big of a problem....with your production Halo, you will reach number 10 pretty soon anyway and will be in the same position that you are objecting to. This is not just a protein change but a substantial change to the client, code and the entire project. This is indeed a different "phase" of the project and I believe that the stats should be zeroed.

Maybe I will reach #10 but maybe those coming up behind will beat me to it. The point is that the stats can change, there are people competing.

You zero the stats and within a week, everyone will have taken the place in the ladder and there will be no competition! Stats aren't everything BUT they provide interest and competition. If we put everyone back to zero the only competition will come from new joiners.

I realise I am in the minority here and will probably lose but I really don't think you all realise the inertia that will happen.

jkeating

01-17-2003, 09:28 AM

I'm not a top 10er - a top 100er last time I looked, however i've been with the project from the early stages... I'll add my voice in here and say "ok to restart the stats"...

Brian the Fist

01-17-2003, 09:34 AM

It's great to see so many first timers posting here, your input is especially appreciated (get tired of hearing from the same people all the time...) :p

A few people wondered about this so to clarify - whenever you stop the client and restart it later, it will continue exactly where it left off, on the same structure number and the same generation. You do not have to complete 50 generations before you upload, it'll upload once per generation. You will still be able to copy files to upload from a different machine, the exact files will just be a little different (instructions will be given in the readme).

I'll try to get some graphical summaries of our CASP performance posted today or next week on the Results pages.

The argument for zeroing the stats has nothing to do with the fact that the algorithm/scoring is changing, it has to do with the fact that we are beginning a new phase of the project, and can give some newcomers a chance to get a decent ranking. It is a convenient point at which to 'level the playing field' so to speak.

I will not erase 0-production users though, because who knows why they're at 0? Maybe they're buffering oodles of work and just preparing to upload it all? Or are having trouble getting through a firewall? Or whatever. My goal in the 'official' stats is to deliver to you all the raw information exactly as we have it. Then thanks to all your 3rd party stats dudes, you can filter out whatever you don't like and make it more presentable.

I think that addresses most of the issues mentioned here.

bwkaz

01-17-2003, 09:35 AM

Halo -- the question that jumps to my mind is, why isn't that "inertia" effect happening now? Especially with all the people that have been at the top for however long?

Or do I just not get what you're saying? (possible...)

You're saying "you zero the stats and within a week there will be no competition", but how does that make any sense when you're saying earlier that "stats can change"?

If this sounds overly confrontational, sorry, it isn't meant to, I'm wondering why you think that.

thezo

01-17-2003, 11:27 AM

To add my $.02:
I think looking at this next phase as a whole new challenge will be good for the project. I am excited about a new competition with zeroed out stats and that we may get better results. I admit that I don't fully understand the changes as of yet, mostly because I don't totally understand the underlying science of the whole project.

I don't think the idea of not being able to move the cached results from one machine to another for upload is the best choice. From what I understand, this is a common practice and removing this option will imo drive some producers away. This isn't an issue for me, so I could be wrong.

Grumpy

01-17-2003, 11:32 AM

As a fully qualified Stats Ho for OCworkbench, I am somewhat saddened by the impending loss of our current work. Nevertheless, the Project is why we are here, and as long as we have a new stats system that allows for internal and external competition similar to the existing one, I see no problem. Our Team thrives on the competition, and benefits from the Benchmarking aspect also. The loss of competition would have an impact on output, as to how much I cannot hazard a guess.

scruff35

01-17-2003, 11:34 AM

I'm new to DF. I have been with the AMDMB Killer Barbarian Frogs for a month now. I'm also just like jaydee116, I just switched from AMDMB's Seti team. I was there for almost 2 years. Now I'm with the DF project for the same reasons. I really like this project and will continue crunching for it no matter what. I would like to say that I'm for the reset of the stats, with the older stats still active for viewing for all that have contributed. Just want to end this on a positive note. I think the ones that stick around after the stats are zeroed will be the ones that care deeply about this project. And hey, wouldn't it be fun passing everybody again, for the big/small crunchers out there? ;)

FoBoT

01-17-2003, 11:39 AM

Originally posted by Brian the Fist
(get tired of hearing from the same people all the time...) :p

:cry:

now you've gone and hurt my feelings ;)

:crazy:

FoBoT

01-17-2003, 11:42 AM

Originally posted by Grumpy
As a fully qualified Stats Ho for OCworkbench,

:rotfl:

:rotfl:

:jester:

where do i get my certificate? :crazy:

bguinto1

01-17-2003, 02:20 PM

When are you targeting this new algorithm? Will it be after this current protein is finished? Also, are you intending for the current client to automatically update to this new algorithm or will we need to download a new application?

Thanks.

PY 222

01-17-2003, 02:26 PM

Just putting in my vote.

I vote for a reset of the current stats but keep all the records and display it under Phase 1 completed or something like that.

It would be good as I might be able to show my grandchildren one fine day about what I did when I was with DF and what position I was in :|party|:

Heck, it might just motivate them to run DF as well, if everyone is still around :D

vsemaska

01-17-2003, 03:12 PM

Originally posted by bguinto1
When are you targeting this new algorithm? Will it be after this current protein is finished? Also, are you intending for the current client to automatically update to this new algorithm or will we need to download a new application?

Thanks.

Howard said there'll be a beta test phase before its full release. So I assume that there'll be 1 or 2 more proteins that'll use the current algorithm.

Vic

tpdooley

01-17-2003, 03:48 PM

Halo:
I've been here since June, and production rates have never seemed to be constant for the members of OcWorkbench, or for the DF community at large. (Brian's 500 mil hasn't increased dramatically in the 6 months I've been here, for example - it looks like he was allowed more time on the local MegaWomp during the start of the project than he is now). The leading folder at OcW when I started had 25% of the folds for the whole team, and was running a horde of machines that made catching up with him with my 1 system look impossible. I've since passed him when I set up a host of machines to increase my production rate - and he's now dropped down to 1? machine.
Between losing hardware, losing interest in the project, or gaining hardware/increased interest in the project - everyone's production rates seem to change over time. And they'll most likely continue to vary in the future.

HaloJones

01-17-2003, 04:31 PM

The "inertia" isn't happening now because so many people started (and gave up) at different times. I'm closing in on ZaphodB directly above me and have just waved by Michel as he passed me while bguinto1 sped past a little while ago. These changes are because we started at different times.

Now picture the new system.

Lemonsqzz immediately is #1. No-one can or will ever threaten that place.
bguinto1 is immediately #2. He will get further behind Lemonsqzz every update.
Michel is immediately #3. He will get further behind bguinto1 every update.
etc. etc.

Unless one of them gives up or another one ramps up massively, the top n crunchers will be rigidly in place. At the bottom, it will be different as people have fewer machines and perhaps don't run 24/7. Upgrading to a better processor or running overnight can change a smaller producer's output significantly enabling relative acceleration. When you have 10s of computers running 24/7, it is much harder to impact output - particularly over the short-term.

For this reason, the top 100 or so are likely to be boringly static.
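The argument above can be checked with a toy model. This is only a sketch under a strong assumption, namely that every cruncher produces at a fixed daily rate; the names, rates and head starts below are invented and do not come from the real stats.

```python
def ranking(totals):
    """Return names ordered from highest total to lowest."""
    return sorted(totals, key=totals.get, reverse=True)


def count_rank_changes(start, rates, days):
    """Count the days on which the ladder order differs from the day before."""
    totals = dict(start)
    previous = None
    changes = 0
    for _ in range(days):
        for name in totals:
            totals[name] += rates[name]
        current = ranking(totals)
        if previous is not None and current != previous:
            changes += 1
        previous = current
    return changes


# Hypothetical constant daily production for three crunchers.
rates = {"A": 30, "B": 20, "C": 10}

# Everyone reset to zero: the order is fixed by production rate from day one.
zeroed = {"A": 0, "B": 0, "C": 0}

# Staggered history: slower crunchers who started earlier hold a head start.
staggered = {"A": 0, "B": 101, "C": 152}

if __name__ == "__main__":
    print(count_rank_changes(zeroed, rates, 30))
    print(count_rank_changes(staggered, rates, 30))
```

With constant rates the zeroed ladder never reorders, while the staggered one does until the faster crunchers have caught up. Whether real production rates stay constant is exactly what tpdooley disputes above.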

Whatever, since no-one else objects to a reset, my argument is moot. I reluctantly withdraw my objections and will attempt to put in as much effort to DF.II as I have to DF.I.

Grumpy

01-17-2003, 08:17 PM

University of Imaho - Statistics Without Morals 101

Brian the Roman

01-17-2003, 08:57 PM

Howard,
Could you give us a quick explanation of how the structures of generation 0 are different from those of later generations? You say later gens will take longer per structure and one therefore assumes that you believe their probability of being better structures is higher. How are you doing this? Are you possibly using the best energies calculated from the previous generation and 'looking around the neighbourhood'?

ms

reader50

01-18-2003, 12:11 AM

I live in California, the electricity is not cheap. And paid for out of my own pocket, not on a business ledger.

From a stats-keeping standpoint, restarting with lower totals (and ones that rise slower) would be nice. But I've pretty well covered the integer overflow problems.

My vote is to keep the stats units I've paid for, not to put them out to pasture. My total is not in the Top 10 (or anywhere near), but it still kept my bedroom noisy for weeks on end. Slowed my games down too. (serious sacrifice there ;) )

Grumpy

01-18-2003, 12:30 AM

This is the biggest problem with zeroing the stats: people who are not large folders have put in a lot of effort to get where they are. To suddenly wipe all their effort away disadvantages them a lot more than the top 100. Think of those on only one computer, a slow one, that have been in it from the start. They are about to hit 10 Million after all that effort...bang, stats zeroed. I abide by the referee, but I am in the keep the stats camp.. Erm, can't you have both running at the same time? Just have an option to choose Display Combined Stats or Display New Project Stats. Users can then follow whichever method they wish :confused:

FBK

01-18-2003, 02:03 AM

8 Months ago I joined DF. Since then, I added 4 folding machines, to my existing 6 machines. My monthly electric bill jumped significantly. I monitored and tended my personal farm. I crunched, 7x24, since then, through slow genes as well as faster genes. I lost work from overloaded server(s), 904 errors, failed updates and a half dozen other reasons. I didn't expect anything in return. But, I am proud of my contribution, and that contribution is reflected only through stats.

8 months ago, many other people joined DF. Some are still folding. Some folded for a while, but have long since quit. Some bought and supported personal farms. Some put forth the gargantuan efforts to Borg work machines. Some NEVER FOLDED a SINGLE UNIT.

In a few weeks, apparently, we will all be equal on the stats.

Perhaps I'm a stats whore. Perhaps I'm not noble or righteous.

Somehow zeroing the statistics, and my contribution, just doesn't seem right.

FBK

MAD-ness

01-18-2003, 02:10 AM

I am glad that some alternative viewpoints have been expressed.

I knew that some people would not be excited at the prospect of a stats zeroing but it took them a while to show up and voice an opinion.

I can go either way, neither will make or break me but for some people it is a larger issue. Understandably so for those who did a lot of work keeping networks running or footing higher energy bills to keep the home computers on 24/7 and crunching away.

Goobee

01-18-2003, 03:39 AM

I would rather see the current points rolled over. My farm does not run 24/7 for free; in a manner of speaking, I paid for the points I produced. I do not participate in non-medical DC projects, only in projects that may one day lead to finding cures for illnesses/diseases.

In return for my expenses, I get stats. I like stats, many of us do. My points were difficult to come by. All of my machines are at home - if I could enlist a bunch of machines at work I would but I can't. My points are generated from electricity paid by me; if you ask around, many DC'ers are paying the electrical bills out of their pocket.

You may want to keep this in mind when you guys make your final decision.

Grumpy

01-18-2003, 03:53 AM

As part of the Democratic Process at OCworkbench, I have started a poll on the issue. Members can choose to keep it the way it is, zero the stats, or have a system that has OLD+NEW stats counted OR just count the New stats ( So you would have 2 parallel ladders running and the individual or Team can decide which they prefer to follow). I will post the results when the poll closes.

Perhaps other Teams should do likewise and put their results here too :cheers:

OgreChow

01-18-2003, 10:32 AM

Don't you all see that once the stats are reset they will show exactly how much you are contributing RIGHT NOW?

This means that if you are upset because you are working very hard, all of a sudden you will be working very hard and everyone will know it!

As for concerns that it will stay this way, static - perhaps at first...until you add a computer to try to bypass the person ahead of you, or someone switches teams, or someone loses interest - in short, all the things that have led to stat-climbing and dropping today. It will just be a short sprinting lap before we begin the long-distance race in which we are currently engaged.

I personally think that the first few hours after the reset are going to be incredibly exciting - I know I am going to find a few more machines to crunch on and will be updating every few minutes ;)

-OC

PinHead

01-18-2003, 11:01 AM

I don't have a preference either way on the stats!

But if they are zeroed, I think the folders are looking for some type of recognition for the work that they have done. Either a static past results page or a trinket of some type by their name, to indicate their participation in phase 1. Maybe even just a number next to their name indicating phase 1 ranking.

Also, will this progie have new hardware requirements?
Higher minimum cpu or memory?

Brian the Fist

01-18-2003, 11:22 AM

Well, you just gave me a somewhat wild idea there, inspired by our friends at eBay. What if we zero the stats as described, but keep (internally) a record of your production in phase I. Then we internally add phase I + phase II production and call that total production.

Stats pages will show only phase II production BUT you will get a series of coloured stars or other objects next to your name based on total production (I noticed some 3rd party stats pages do something like this already). Thus you can quickly identify people who've been around for a long time and contributed a lot even though their phase 2 production may not be so much.

Lemme know, and if you have a preferred color scheme..

To Brian the Roman:

Initially, the generation N structures will be generated as 'near neighbours' to the lowest energy/RMSD structure from generation N-1, and so on. Later on this may change slightly, and generation N structures may be random combinations of fragments from generation N-1 structures.
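
As a rough illustration of the 'near neighbours' scheme described above (not the project's actual code), here is a toy sketch in Python. A 'structure' is just a list of numbers and `score` is a made-up stand-in for energy/RMSD; every name, batch size and step size below is an assumption for illustration only.

```python
import random


def score(structure):
    # Hypothetical stand-in for energy/RMSD: lower is better.
    return sum(x * x for x in structure)


def random_structure(rng, size):
    # Generation 0: structures are sampled at random.
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def near_neighbour(rng, parent, step):
    # Later generations: small random perturbation of the parent structure.
    return [x + rng.uniform(-step, step) for x in parent]


def run_generations(generations, first_batch, later_batch,
                    size=8, step=0.1, seed=0):
    """Return the best score found in each generation, in order."""
    rng = random.Random(seed)
    batch = [random_structure(rng, size) for _ in range(first_batch)]
    best = min(batch, key=score)
    history = [score(best)]
    for _ in range(1, generations):
        # Generation N is built around the best structure of generation N-1.
        batch = [near_neighbour(rng, best, step) for _ in range(later_batch)]
        best = min(batch, key=score)
        history.append(score(best))
    return history


if __name__ == "__main__":
    for gen, value in enumerate(run_generations(5, 50, 20)):
        print(gen, round(value, 4))
```

Note that in this toy version the parent is not carried over into the next batch, so the best score of generation N is not guaranteed to beat generation N-1. The later variant mentioned above (random combinations of fragments from generation N-1 structures) is not sketched here.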

Faceless

01-18-2003, 11:29 AM

Whatever you want to do is OK by me. I don't want to be a whiner wanting the stats reset just because ExtremeDC is not on top of this project. The project is worthwhile and I'm planning to stick with it.

cygnussphere

01-18-2003, 11:40 AM

Hum.... Me clicks on link to Dyys Top 1000 users page.

Scrolls to page 4. current position 168/1000

now clicks on handy sort by last weeks production button.

Wasza! scrolls back to pg 2 status now 76/1000.

Where do I find the reset button on this thing?
:D

Actually, The dual recognition plan does seem like a good idea.

Fellow M.O.B. (minion of brian) :cheers:

Insidious

01-18-2003, 12:24 PM

First of all, I am here for the duration no matter what is done with the stats.

Is it correct that everyone here is talking about stats pages other than the one on the Distributedfolding.org site? I don't find the information everyone is referencing on that one.

(I use statsman.org and stats.zerothelement.com)

Doesn't all this discussion only apply to Dyy and whoever does Statsman?

My utopic stats page would open to dynamic stats of Phase II with a button that would link me to the (static) Phase I final results.

(I do like the star idea identifying a user who has Phase I input next to the name on the Phase II page)

-Sid

grobinette

01-18-2003, 12:56 PM

This is getting better, I like the zero out the stats but with some form of recognition for phase 1 activity idea....

Anarchy99

01-18-2003, 01:31 PM

leave the stats ppppllllleeeeeeeaaaaasssssseeee

"no disassemble" johnny 5

"we likes our precious stats we does" gollum/smeagol

"stats good hhmmmmm" yoda

"we like the stats the way they are pilgrim" john wayne

you could zero them but leave us a slightly transparent block with the previous effort made :D

"me thinks you a got it now" jar jar binks

"happy birthday mr president" marilyn monroe :|ot|: [but what a hottie]

some consideration to the members who cant climb as fast to show how long they have been at it, helps them justify the effort
and gives them a benchmark for the time they devoted.

but I do understand the guiding principle behind a zero move :thumbs:

A99

cygnussphere

01-18-2003, 03:03 PM

stats.zerothelement.com/index.php (http://stats.zerothelement.com/index.php)

These would be the stats referenced previously

Starfish

01-18-2003, 03:10 PM

Originally posted by Insidious

(I do like the star idea identifying a user who has Phase I input next to the name on the Phase II page)

-Sid

I too like the idea of a star identifying who has helped during Phase I.

I can understand that people who have done 10+ Million or even 100+ Million (took me ~ a year just to break 5M two weeks ago :cry: ) are not really waiting for a complete reset.

Perhaps the star idea can be realized in such a way that there are a few star colors like you see in some statspages?

Example: Den's Distributed Folding stats :: Team Endeavor:
http://users.neotechus.com/~hruzaden/folding/stats/endeavor-df-stats.html

(especially see the overview on the bottom of the page)

possible solutions are:

color x every xx structures
color y every yy structures

or

color x between x and y structures
and color y between y and z structures

That would give the people who'll "lose most" some distinction instead of "just one Phase I star" ;)

Whatever the decision:

It's great to see that we're about to move to a next Phase..Cheers to all who helped achieving this!

:cheers:

FoBoT

01-18-2003, 03:27 PM

Originally posted by Brian the Fist
Well, you just gave me a somewhat wild idea there, inspired by our friends at eBay. What if we zero the stats as described, but keep (internally) a record of your production in phase I. Then we internally add phase I + phase II production and call that total production.

Stats pages will show only phase II production BUT you will get a series of coloured stars or other objects next to your name based on total production (I noticed some 3rd party stats pages do something like this already). Thus you can quickly identify people who've been around for a long time and contributed a lot even though their phase 2 production may not be so much.

Lemme know, and if you have a preferred color scheme..

just can't stop thinking of a better way, can you? :D

i am sure that would be more palatable to the people that think moving to a new phase in the project is "zeroing" the stats

i can't wait!! :|party|:

Anarchy99

01-18-2003, 04:25 PM

hey I like the star Idea :)

reader50

01-18-2003, 05:23 PM

Question1: if units crunched today are going to be tossed, why should I crunch today?

We are fighting off another team in Folding@home this week, but I was crunching 24/7 last week.

Question2: if our stats are taken away, why should we crunch Distributed Folding in the future? There would be a precedent for taking our stats away now and then.

AMD_is_logical

01-18-2003, 05:44 PM

Well, I voted for continuing with the old stats, but if we go with the star idea, how about a sort of spectrum.

* - up to 5M
* - 5M up to 10M
* - 10M up to 20M
* - 20M up to 50M
* - 50M up to 100M
* - 100M up to 200M
* - 200M up to 500M
* - 500M up to 1000M
* - 1000M and up

Note that I used an asterisk as a star. There may be a better font+character available. The page shouldn't try to load a bunch of .gif files, though. That would make the stats pages take longer to load.
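
The spectrum above could be computed from a user's total production roughly like this. A minimal sketch, assuming tiers are numbered 0 to 8 in the order listed and that 'up to' means strictly below the upper bound. The function name is made up, and the phase I + phase II sum follows Howard's earlier post; none of this is anything the project has published.

```python
from bisect import bisect_right

MILLION = 1_000_000

# Lower bounds of tiers 1..8, taken from the list above (in structures).
TIER_BOUNDS = [5 * MILLION, 10 * MILLION, 20 * MILLION, 50 * MILLION,
               100 * MILLION, 200 * MILLION, 500 * MILLION, 1000 * MILLION]


def star_tier(phase1, phase2=0):
    """Return the tier index (0 = 'up to 5M', 8 = '1000M and up')."""
    total = phase1 + phase2  # total production = phase I + phase II
    # bisect_right counts how many lower bounds the total has reached.
    return bisect_right(TIER_BOUNDS, total)


if __name__ == "__main__":
    for total in (0, 5 * MILLION, 75 * MILLION, 1000 * MILLION):
        print(total, star_tier(total))
```

Since the tier is just a small integer, a stats page could map it to a coloured text character rather than an image, which fits the concern about load times.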

Goobee

01-18-2003, 06:51 PM

While a few of us have lodged our protests with respect to the zeroing of points, that issue appears moot at this point.

It seems that will happen whatever is argued here - the only real issue is what to do with the old results once the new project begins.

I do not see why it will be so difficult to link the totals from the old database to the new database and show the aggregates??!! A child of five can write simple code to link two tables and produce a sum.

Who cares if the algorithms in the old project vs. the new project are different..............stats are stats!

Are you guys asking for our opinions or simply giving us a rhetorical question? :confused:

KWSN Grim Reaper

01-18-2003, 11:18 PM

I REALLY like the star idea too, although I'd like us to continue with the current stats so that we don't lose anybody. Someone mentioned running a poll...but what would be the margin for zeroing the stats? 51%? That would leave 49% not very happy with it, what would be the acceptable 'loss' of participants? Clearly some people have a big problem with it :(

Ni!

FoBoT

01-18-2003, 11:25 PM

FlowerKid

01-19-2003, 12:21 AM

Exactly right FoBoT. This phase is ending, so I think the stats should be reset to reflect that. Just a designation of your final standing in Phase I would be sufficient for me, personally. I will still be crunching in this project no matter what the decision, and I hope others do too.

MAD-ness

01-19-2003, 01:14 AM

I like your avatar FK. :)

You ought to drop by and say hi to your old friends every so often. :(

Grumpy

01-19-2003, 01:24 AM

Well, if 49% are not happy with the zeroing, logic dictates at least having an aggregate added to the stats page..currently at OCworkbench the result is 55% against zeroing.

Starfish

01-19-2003, 08:15 AM

Originally posted by reader50
Question1: if units crunched today are going to be tossed, why should I crunch today?

We are fighting off another team in Folding@home this week, but I was crunching 24/7 last week.

Question2: if our stats are taken away, why should we crunch Distributed Folding in the future? There would be a precedent for taking our stats away now and then.

Answer1: I don't think that the actual work is going to be tossed away....the project is just coming close to a new Phase and the question is: what are we going to do with the stats now that the way of generating structures is going to change?

Answer2:
No matter what happens to the stats, the project has a certain set of goals/a cause it wants to achieve.

It's up to you if you find those goals/the cause more interesting then possible resets of the statistics in the future.

Personally: I like stats..it's nice to have a 'dynamic playing field' around you while you're generating structures...but the cause of the project is what keeps me running this client... if the stats were taken away things would become less fun...but someone has to do the work ;)

StrategyFreakAMD

01-19-2003, 10:27 AM

I am all for zeroing. This would help to get rid of all the people with many structures but not producing so the smaller teams actually have a chance. It will also get rid of the thousands of inactive people who haven't turned in a structure at all. Finally, if the top 10 do have enormous production, they will have no trouble getting back to their old positions.

If you were to set up a poll about the stats, at least 2/3s of the people should be for zeroing the stats.

Bionic_Redneck

01-19-2003, 10:57 AM

I agree it won't make a difference if they split the stats. It might even get others involved in DF.

pointwood

01-19-2003, 11:11 AM

Reader50, if you're only doing this for the stats, then why did you choose DF? Why not RC5, which has a client with smaller requirements? I know that if I were only doing it for the stats, then RC5 would be my choice.

Right now we have a large lead and a retirement of phase 1 would give others a good opportunity to challenge us. While it is nice to be outproducing everyone else by a large margin, there is not much fun happening. It was much more fun when we were trying to catch Free-DC.

With that said, I don't really care whether the stats are being reset. You, Howard, should do what gives the best options for making the project grow in the future.

I really like the way you, Howard, handle this - makes me feel appreciated :) And I think just the fact that it has been discussed here will go a long way to satisfy most people, nearly no matter how you end up making it.

If the stats are reset, then I really like the "star idea". That plus having a page with the stats from phase 1.

Brian the Fist

01-19-2003, 11:49 AM

Originally posted by reader50
Question1: if units crunched today are going to be tossed, why should I crunch today?

We are fighting off another team in Folding@home this week, but I was crunching 24/7 last week.

Question2: if our stats are taken away, why should we crunch Distributed Folding in the future? There would be a precedent for taking our stats away now and then.

a) because I didn't officially say stats will be zeroed yet (note I cleverly avoided actually stating this :cool: )

b) because now you will get a cool star colored based on your production since the start.

For the stars I was thinking of blatantly stealing them from eBay (would that be illegal or something???) cause I really like the eBay feedback stars, they're cool (come on, someone here must use eBay besides me..)

reader50

01-19-2003, 03:03 PM

Wish more people would jump into the "No" side of things. I'm looking like a troll by carrying most of it. :(

pointwood, I usually evaluate a project first, see if it is doing something worthwhile. Then crunch for the stats if it is. If the project does not seem worthwhile *RC5*, I'll still crunch a few units to get a feel for it, so I can write our Install pages and build stats for it. Sometimes, I even jump in when the team is under pressure. But that's about it.

Stats are the intangible return for our volunteered resources. Besides recording our contribution, they also allow us to play games on the side such as KWSN and others sometimes do. This is ok. :)

The project may see stats differently, but I see them as "ours". They got the free CPU time; we got the bills, noise, heat, and our stats. Since the stats seem tied to our contribution rather than a particular project codebase, I have yet to hear a good reason for nuking them. A bonus spelling-bee star is not what I had in mind. Not even if it's in color.

Stats accounts are usually zeroed for cheating. IE: you submitted fake units to SETI in an effort to rise up the charts faster, or to push your loser team above more honest teams. Such accounts should be zeroed. But all my units were crunched honestly on my single home computer. On all projects, not just Distributed Folding. I didn't cheat, and put considerable resources into reaching my present position.

I want to keep the stats I worked for. Taking them away means I have to either drop DF, ignore a pitiful position at the bottom of the team, or return and crunch upwards for several months. Time that could go to other projects that are also worthwhile. This would certainly be nice for the DF project, but I already put that time in.

Side note for Howard: Why do foldingathome.com (http://www.foldingathome.com), foldingathome.net (http://www.foldingathome.net), and foldingathome.org (http://www.foldingathome.org) all point to the DF project? Shouldn't they point to Stanford?

To others who wish to keep the stats they contributed: Please post. Contrary to popular opinion, Free-DC no longer requires you to join their DC teams before you can register on their boards. Also, the per-post charge is waived in the Distributed Folding project forum. Jump in. I can't carry it all, it's not worth getting banned.

pointwood

01-19-2003, 03:41 PM

Reader50, I certainly don't consider you a troll (and I don't think others do either).

As I think I've said before, I don't really care what Howard's decision ends up being, I just don't care enough about the stats :)

The reason to do a stats reset IMHO is (at least partly) because it is something similar to 2 separate projects.

As I understand it, we are starting from scratch (well, it builds upon the current version, but it is much improved and works a lot differently too) and that makes it somewhat natural to start from zero with the stats too.

FoBoT

01-19-2003, 04:02 PM

Originally posted by reader50
Side note for Howard: Why do foldingathome.com (http://www.foldingathome.com), foldingathome.net (http://www.foldingathome.net), and foldingathome.org (http://www.foldingathome.org) all point to the DF project? Shouldn't they point to Stanford?

uh , maybe howard was smart enough to pay $15/yr for those domains and the *people* (edited out a mean name calling word :rolleyes: ) at stanford aren't internet savvy enough to get those domains?

FoBoT

01-19-2003, 04:07 PM

howard is in a no win situation here

if he starts a new set of stats for DFII , X% of the current crunchers will be pissed off

if he comes up with some magic formula to equate DFII production to DF1 production, then X% of the current crunchers will say it is unfair because , either

A- the new scoring is biased towards the old phase 1 people or

B- the new scoring is biased towards the new phase 2 people

howard, i don't envy you dude, X% of the current people are going to crap on you regardless of your decision :(

i still think DF is #1 DC project going right now and am excited to see what the science of the NEW system will bring (this coming from a moderate stats ho!)

Insidious

01-19-2003, 04:26 PM

my impressions (without even looking at a stats page to back them up) would be that it seems odd that a participant would have such difficulty with being able to view what they accomplished in Phase I on a page and being able to view what they are presently doing in Phase II on a different page.

Could it be that there are users who did a lot for a while, but have decreased their commitment to a point that if only today's effort was on the page they would not look as good?

I have no sympathy for anyone who thinks there is actually much of an accomplishment to view if it all depends upon the past. I see nothing glorious about resting on one's laurels!

I personally like the idea of those who have reduced their commitment to DF being displayed as such.

In other words:

If you were doing a lot in the past, it showed then.... That was your 15 minutes. But why should Howard provide you a false front to make you look good if you are no longer putting in that kind of effort?

Aegion

01-19-2003, 04:57 PM

Originally posted by reader50

Side note for Howard: Why do foldingathome.com (http://www.foldingathome.com), foldingathome.net (http://www.foldingathome.net), and foldingathome.org (http://www.foldingathome.org) all point to the DF project? Shouldn't they point to Stanford?

The two projects were originally conceived at about the same time. I believe Howard was going to call his project Folding at Home, but Stanford took the name for their project first. Read some of the details of the official DF site info for more info on when the Distributed Folding Project was first conceived.

PinHead

01-19-2003, 05:34 PM

Why not the little origami pics from the website.

Clean them up, shrink them down a little and colorize them.

Anyway, if I am reading the emotions correctly, it is starting to sound like a current project stats page with a trinket and a second stats page with cumulative stats might make the majority happy.

I don't think that it is important why a person donated their computer time; the important thing is that they did choose to! So hopefully everyone can chip in and come up with some common ground.

horus2

01-19-2003, 05:42 PM

What I would like to see is 3 different stats databases. One with the first phase stats, one with the new phase and another one which is just a sum of the two. The final database would not even have to be updated as often.

I think the stars idea is a little gimmicky though.

Dyyryath

01-19-2003, 06:41 PM

I feel pretty open minded about the whole thing. Zero the stats, don't zero the stats, create an aggregate of the stats, give us stars for past work, they all work just fine for me. My only concern is how any given decision will impact the project by irritating users. FoBoT's right, no matter what choice is made, people aren't going to be happy.

If it were me, I'd probably do it something like this...

Each Phase becomes a game in the larger 'World Series'. Given future refinements and Howard's dedication to continually improving the client and algorithms, what makes us think this will be the last 'Phase' anyway? And even if it's not, does that have to be a bad thing?

I'd have a set of stats for each phase that's finished, active stats for the 'current' phase, and 'overall' stats that are the sum of all the project phases to date.

My way of doing things wouldn't be the simplest way, though, and I'm sure that there is a limit to just how much time/energy/resources Howard is willing/able to put into the way stats are handled.

The fact that he's given the whole stats thing as much consideration as he has puts him head and shoulders above his peers in other projects in my view.

It'd be cool if there was a way that those of us willing to could help out with the 'official stats' in some way, but that's probably more hassle than Howard wants to deal with as well.

Insidious

01-19-2003, 06:59 PM

Originally posted by Dyyryath

The fact that he's given the whole stats thing as much consideration as he has puts him head and shoulders above his peers in other projects in my view.

Here Here! :cheers:

Grumpy

01-19-2003, 08:02 PM

For a bit of trivia, the OCworkbench poll has so far received 20 replies or 1/4 of our active users. 7 want the stats zeroed (35%), 7 want a system so you can choose old&new OR just new (35%) and 6 want to have just the old stats (30%). Not a large number of replies, but given the wide range of users we have from all over the world, I feel these numbers would remain the same no matter how large the vote becomes. The results.....people want the old stats to continue to be updated (65%) and 70% want to see the new stats by themselves.

I am sure you can interpret this a million other ways but i do not want to know :notworthy

As Fobot said, you will always make someone unhappy. :bang:

TheOtherZaphod

01-19-2003, 08:14 PM

I guess I've got to stop over here more often. I don't care all that much if you reset or re-calibrate your scoring system but I do have a few observations, so bear with me...

I think you need to carry something forward. People want acknowledgement for time served and effort put forth. Just a week or so ago FoBoT was looking to find his reg. date for example. I am closing in on my first half billion; that's something I'll want to remember, and celebrate somehow.

Whatever you do, you are doomed to piss a few people off. If you turn this into too much of a democratic process, you will end up with a mediocre committee-type decision. At some point you are going to have to just go with your gut. Personally, I can't stay mad, for long anyway, with someone who does that.

My home team, OCN, is currently engaged in some SETI nostalgia. If you reset scores we will probably never see the top ten again, and that's Ok with me. I am currently positioned just past the top ten, and crunching at about that same level. If nothing happens to change the current scoring my next year will be spent going up a few places, then going back down about the same...

HaloJones has been creeping up on me for quite a while now, if you blow the counts away just before he passes, I'll give you a shiny Looney.

Participants are always coming and going. For a relatively young project there are a fair number of flat-liners, even in the top ranks. Getting them off the active stats pages will be good in some ways, and bad in others. I tend to set goals for myself that relate to passing people, flat or otherwise. I have Data and Jodie to thank for my humongous electric bills as much as myself. If we all go back to zero, we will still have new people enter the project, and leave it as well; that's just life.

I would like a nice "group photo" at the end of the current festivities. This has been a good group to compete with, and that too, deserves some acknowledgement.

edited to add that Howard had better bring this to an end soon, or I will be knocking him out of his own top 10 :moon: .

HaloJones

01-20-2003, 07:02 AM

HaloJones has been creeping up on me for quite a while now, if you blow the counts away just before he passes, I'll give you a shiny Looney.

Y'see Howard. Zero the stats before I pass him and he loves you. But I'll be seriously p*ssed :)

(Actually, on current production, I won't catch you, Mr !=Beeblebrox. And HaloJones was (will be?) definitely a girl (although I'm not :confused:? you will be!) )

Old stats still available, new stats available and a joint stats available. How can anyone be unhappy with that as a solution?

AMD_is_logical

01-20-2003, 08:34 AM

Originally posted by FoBoT
howard is in a no win situation here

if he starts a new set of stats for DFII , X% of the current crunchers will be pissed off

if he comes up with some magic formula to equate DFII production to DF1 production, then X% of the current crunchers will say it is unfair because , either

A- the new scoring is biased towards the old phase 1 people or

B- the new scoring is biased towards the new phase 2 people

Your assertion B is false. People rarely complain when they see their computer producing better numbers. A while ago the client was updated and the new client generated about twice as many structures per CPU-hour. I don't recall anyone complaining about that. People like seeing their computers producing better numbers.

Thus, the assertion that Howard is in a no-win situation is also false. All he has to do is to make sure the new client produces better numbers than the old one.

OTOH, if word gets around that DF zeros the stats every now and then, a lot of stats-loving people who are looking for a project will look elsewhere. I like DF and will continue even if the stats are zeroed. But, back when I was deciding which project to move to, if someone had told me DF zeros its stats every now and then, I would probably have chosen some other project (or maybe even have stayed with Stanford's Genome).

AMD_is_logical

01-20-2003, 08:45 AM

Originally posted by horus2
What I would like to see is 3 different stats databases. One with the first phase stats, one with the new phase and another one which is just a sum of the two. The final database would not even have to be updated as often.

I would like to see the cumulative stats updated as often as the stats are updated now. One of the things I really like about DF is that the stats are updated every 10 minutes like clockwork. And the cumulative stats are the ones I would be looking at. The ephemeral stats for the current phase would be of only passing interest to me.

FoBoT

01-20-2003, 08:49 AM

Originally posted by AMD_is_logical
Your assertion B is false. People rarely complain when they see their computer producing better numbers.

what about a person that currently has 100 million under the old system, but has recently dropped from 7 PC's to use down to just one?

their current rate under a scoring system that is relatively faster will be lower and thus they will lose position at a higher rate due to "the unfairness" of the new scoring system

if there is the smallest change, somebody will bitch about it, it seems to be human nature

if the only consideration is the least "noise" then that option might be the path of least resistance, but if what is best for the project overall is paramount, then perhaps another option will be chosen

luckily i don't have to decide, i am just another cruncher :p

gnewbury

01-20-2003, 08:51 AM

Originally posted by Dyyryath
I feel pretty open minded about the whole thing. Zero the stats, don't zero the stats, create an aggregate of the stats, give us stars for past work, they all work just fine for me. My only concern is how any given decision will impact the project by irritating users. FoBoT's right, no matter what choice is made, people aren't going to be happy.

If it were me, I'd probably do it something like this...

Each Phase becomes a game in the larger 'World Series'. Given future refinements and Howard's dedication to continually improving the client and algorithms, what makes us think this will be the last 'Phase' anyway? And even if it's not, does that have to be a bad thing?

I'd have a set of stats for each phase that's finished, active stats for the 'current' phase, and 'overall' stats that are the sum of all the project phases to date.
<snip>

As a long time lurker, dedicated F@H and G@H, I'm finally glad to see someone post some sense.
It is ridiculous to view any serious scientific DC project as never ending. As Dyyryath suggests, treat it as a world series, or world cup. Set a date, tell us when the playoffs are and after the game we all shake hands and start over again. Perhaps with enough down time for a good disk defrag :)
Then do it every year.

FoBoT

01-20-2003, 08:58 AM

Originally posted by TheOtherZaphod

Whatever you do, you are doomed to piss a few people off. If you turn this into too much of a democratic process, you will end up with a mediocre committee-type decision. At some point you are going to have to just go with your gut. Personally, I can't stay mad, for long anyway, with someone who does that.

:notworthy

wise words
i hope we can get through this and into the next step of the project with a minimum of bad feelings

Brian the Fist

01-20-2003, 09:26 AM

Originally posted by reader50
. A bonus spelling-bee star is not what I had in mind. Not even if it's in color.

Maybe you won't like it, but, the star WILL in effect represent the billion or so structures you have crunched in the past. So we won't be taking them away from you, simply presenting it in graphical format :cool:

Side note for Howard: Why do foldingathome.com (http://www.foldingathome.com), foldingathome.net (http://www.foldingathome.net), and foldingathome.org (http://www.foldingathome.org) all point to the DF project? Shouldn't they point to Stanford?

Because evidently we plan ahead better than they do...
We owned those before they even existed

HaloJones

01-20-2003, 11:16 AM

If Dyyryath could announce that he will do old/new/joint stats, I think most people would be satisfied. Of course, we would all have to continue with our current handles but I have seen no intention for that to change.

This is simply about how to score isn't it? Some people will say "I'm #xx in the new stats" others will say "I'm #yy in the joint stats" What's the harm in that?

But who is gonna say "I'm #xx at DF and hey, I've got a little green star against my name!" ???

Dyyryath

01-20-2003, 11:47 AM

Originally posted by HaloJones
If Dyyryath could announce that he will do old/new/joint stats, I think most people would be satisfied. Of course, we would all have to continue with our current handles but I have seen no intention for that to change.

This is simply about how to score isn't it? Some people will say "I'm #xx in the new stats" others will say "I'm #yy in the joint stats" What's the harm in that?

But who is gonna say "I'm #xx at DF and hey, I've got a little green star against my name!" ???

I'd definitely be willing to do multiple sets of stats (phase 1, phase 2, total). It should be no problem at all. I don't know if it would really satisfy everyone since they wouldn't be the 'official' stats at the DF project website, but if it helps at all, I'll be happy to do it.

Actually, I don't know if they'd even be interested, but I'd probably be willing to provide either the software I'd be using to generate multiple stats sets to the project itself to use as they see fit, or to completely generate and host an 'official' version of the stats (much like we do this forum) themed to match the rest of the DF site.

I don't foresee the 'handle' thing being a problem. If by 'handle' you mean your username, then you could change it to whatever you want. Your unique ID would remain the same and I could still find you in the old & current stats (to create the 'total' stats) with no problem. If by 'handle' you mean the handle we use to identify our clients to the project, then it would work just like it does now. If you use a different handle on some clients, you're creating a new user which will be tracked differently in the stats than your original user.
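To illustrate the 'total' stats idea, here is a minimal sketch that assumes each phase's stats are available as a mapping from unique ID to points. The function name, the IDs, and the point values are all made up for illustration; this is not the actual stats code.

```python
from collections import Counter


def total_stats(phase1, phase2):
    """Combine per-user totals from two phases, keyed by unique ID."""
    combined = Counter(phase1)
    # Counter.update adds the counts for IDs that appear in both phases
    # and creates entries for IDs that only appear in phase 2.
    combined.update(phase2)
    return dict(combined)


# Hypothetical unique IDs and point totals, for illustration only.
phase1 = {101: 250000, 102: 80000}
phase2 = {101: 1200, 103: 400}

totals = total_stats(phase1, phase2)
for uid, points in sorted(totals.items()):
    print(uid, points)
```

Because the lookup is by unique ID rather than by username, a renamed user still lands on a single combined row.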

Goobee

01-20-2003, 09:50 PM

Originally posted by reader50
Wish more people would jump into the "No" side of things.

Well, I've posted twice already. Seems to have made zip difference. (ie: ignored) The stats will be zeroed and replaced with little stars.

The last time I checked, little colored stars only meant something in Kindergarten. :rolleyes:

PinHead

01-20-2003, 10:20 PM

I think it rather odd that this much "opinion" could be generated by a quantity stat!

No one has even mentioned that the most important stat gets zeroed at every protein change. No stats page ( that I use ) keeps track of your best fold to date ( only current protein ). Yet no one seems to be getting worked up over that stat!

Sorry, just thinking out loud.

reader50

01-20-2003, 10:52 PM

PinHead, I think you are looking for a graph of your Best RMSD over time? If so...

http://teamstats.macnn.com/dfold/stats.php?page=p2&TID=76&UID=90&timelow=1032600458

PinHead

01-20-2003, 11:53 PM

Originally posted by reader50
PinHead, I think you are looking for a graph of your Best RMSD over time? If so...

http://teamstats.macnn.com/dfold/stats.php?page=p2&TID=76&UID=90&timelow=1032600458

Bookmarked that one!:shocked:
That is more stats than any one person should be allowed.

Thanks for the link.:)

Grumpy

01-21-2003, 05:13 AM

Well, after all the posts, all the fuss, it seems obvious that users still want the stats to continue on, but also want a separate count of the new format. (Our little survey has the combined system at 40% now, though it has way too small a user base to be anything but a sample).

The solution seems simple. Distributed Folding has the new stats with the stars, and Dyyryath looks after the combined system..after all, his stats site is THE site for stats, having put a lot of work and effort into it. Most would agree that they go to his site to check their personal and Team progress.

The only piece of the puzzle missing is a system to ensure the new units are within the range of the old client so passing/being passed is not abnormally slow or fast..it would be best if the new stats system is calculated to do this rather than devising a conversion routine.

Thecommi

01-21-2003, 08:36 AM

Hello everybody. I had a brainstorm about how we could reset the stats while making people less angry about it. There are a zillion posts on this subject and I didn't have time to read them all, but I figured I'd put in my idea and see if it makes sense to everyone. I read that each machine would have to have its own ID, drastically slowing down production for users that have DF running on multiple machines. What if, on the stats list, you had a number next to your user name, kind of like the star idea (which is great)? That number would represent the number of machines that belong to you. The individual IDs could be set up as a kind of Personal Team, not a group team, just a way of saying all these IDs belong to you. Then, if you wish, you could join a larger Group team and have full statistics on how many actual machines are being used in that team. This would let everyone start off strong, and the people who would be affected by the zeroing of the stats would still have their edge. I hope I didn't leave anything out, but I'm sure if I did, I'll be hearing from you. :cool:

Brian the Fist

01-21-2003, 09:21 AM

I think I would like to put some links to the 'unofficial' stats pages on the main web site, with your consent of course. I'm sure there are people who would like to make use of them but do not know about them yet. I will only do this for people keeping stats for all participants though, not if you're just tracking your own team. Let me know the exact link you want me to put up and I'll put it right under the stats button on the front page, or something like that, and call it Unofficial Stats.

Thanks.

Dyyryath

01-21-2003, 12:41 PM

I certainly don't mind you linking to my pages. The whole point in building them is to help generate interest in the project, so...the more the merrier. ;)

Calling them 'Dyyryath's DF Stats' or 'Arachnid Stats System' or something similar would be fine. The link should probably go to the status page at http://stats.zerothelement.com. That page links down into all of the stats pages and generally contains information about the system. It'll soon contain some 'help' information as well.

If there's anything you'd personally like to see added or changed, let me know. As I said, the whole point is to help support the project. :thumbs:

Shaktai

01-21-2003, 01:09 PM

Personally I don't have a preference, but I can see the concerns that others have.

I can understand the idea of resetting the stats purely from a project efficiency standpoint. I also think Howard's idea of tracking the new stats, but having a way of adding in the old stats for a total production number, is good too. It is very important, I think, to have some method of giving credit for previous efforts from phase I.

Still, the entire idea will have an impact on a lot of individuals. dFold is not the only project facing this kind of situation. SETI is facing the same situation with the pending release of SETI2 and BOINC. Other projects may eventually face the same issue as well. As software and hardware technology improves, it may periodically be necessary to revamp platforms so radically that it also becomes necessary to change how results are tracked.

I vote for a reset for the new, but with a total production "number" not stars, that also reflects what both individuals and teams have previously contributed.

Also, when the change is made, ensure that clients are available for all of the major OS platforms at the same time so that no group or individuals are left at a disadvantage.

Scotttheking

01-21-2003, 08:08 PM

Originally posted by Brian the Fist
I think I would like to put some links to the 'unofficial' stats pages on the main web site, with your consent of course. I'm sure there are people who would like to make use of them but do not know about them yet. I will only do this for people keeping stats for all participants though, not if you're just tracking your own team. Let me know the exact link you want me to put up and I'll put it right under the stats button on the front page, or something like that, and call it Unofficial Stats.

Thanks.

I'm fine with that, but I don't know if we are tracking every team right now. I'll see if it can be done, not sure if the server can handle the load or not.
I know we are running them for all the major teams.
I'll tell reader50 to stop by.

reader50

01-21-2003, 08:21 PM

Scott, let me know when you plan to add the $15K of extra servers, and the SCSI RAID 10. (reader50 rubs hands together) :)

We do not qualify. We track 14 teams today, and expect to track 10 to 50 teams in full detail at any given time. Even if we wanted to track everyone, our server capacity is perhaps 10% of what we would need. Should this change in the future, we can certainly give Howard a call.

Scotttheking

01-21-2003, 08:27 PM

Originally posted by reader50
Scott, let me know when you plan to add the $15K of extra servers, and the SCSI RAID 10. (reader50 rubs hands together) :)

Never. Raid5 I am planning on, but I'm not getting raid10.

We do not qualify. We track 14 teams today, and expect to track 10 to 50 teams in full detail at any given time. Even if we wanted to track everyone, our server capacity is perhaps 10% of what we would need. Should this change in the future, we can certainly give Howard a call.

Ok.

FlowerKid

01-21-2003, 08:48 PM

You Mac folks do have some great stats, I checked them often when I was with Ars TSF. I have since moved to ExtremeDC and I do miss your stats, but I think overall it would be great if the official page listed the "unofficial" stats that have been set up. All the effort that has been put into those stats should be recognized. :thumbs:

I think the analogy of what SETI is coming up on was a good one. When a phase ends, we should start tracking over again. I believe I speak for ExtremeDC's folding team when I say that we are in favor of a new system, but will continue with the project either way. After all, it's only stats.

The_Equivocator

01-21-2003, 11:16 PM

I'm with reader50 on this one. I don't feel that zeroing the stats is a fair approach to this situation. I put some major effort into really improving my stats over the summer while I had access to machines that are normally not mine to use. Trashing my stats is not something I'd like to see happen. I worked for those stats, as I'm continuing to work on improving them right now. If my stats are going to just vanish, I have much less incentive to dedicate CPU time to dFold as it currently stands when there are other projects whose stats have not been deleted in years.

And then, when phase3 starts, what's going to happen? Will the stats be deleted again? I just want to have some ranking system available to me that displays my total, all-encompassing dedication to the DF project. It's just not fair to delete the stats I've been working on for so long.

I wouldn't mind having a new stats category for phase 2, as long as the main category of stats was merely phase1 + phase2. That way, if people are interested, they can see the difference between the two. But the aggregate stats should still be the deciding factor in determining ranking.

And, I'm sorry, but I think colored stars next to our names displaying our "overall stat total" is ridiculous. Just keep the numbers as they are! Or even edit them so they are more fair/compatible with how the new stats will be. If it is going to take 10 times as long to get 50 stat points in the future, then multiply the new stat number by 10! Or even divide our current totals by 10. Just keep it fair, and let those of us who have really worked on our stats keep them.

Maybe even go a Folding@Home route where 1 stat unit is equivalent to maybe 1 minute of crunching time on a 1GHz P3. That would make the stats fair forever, through any changes, and I'm sure you could figure out some way of converting the current stats over to something like that.

reader50

01-22-2003, 01:55 AM

FlowerKid, here (http://teamstats.macnn.com/dfold/stats.php?TID=1820)

pointwood

01-22-2003, 02:54 AM

The_Equivocator: I think the problem is that it is very difficult (if not impossible) to make it completely fair.

The_Equivocator

01-22-2003, 08:59 AM

Originally posted by pointwood
The_Equivocator: I think the problem is that it is very difficult (if not impossible) to make it completely fair.

While I agree it is impossible to find something completely 100% fair, I still believe that in this situation there is an A that is fairer than B.

pointwood

01-22-2003, 12:02 PM

Which is it to be: more fair to those already crunching, or to those who start now?

What is best for the project? What will make the project grow most?

The_Equivocator

01-22-2003, 12:26 PM

Originally posted by pointwood
Which is it to be: more fair to those already crunching, or to those who start now?

What is best for the project? What will make the project grow most?

I think that it is fair to both parties to just keep the stats. No new members are going to say, "This isn't fair! Why do I have 0 stats points even though I haven't yet contributed any processor time to DF? These other people have tens of thousands of stats points and have dedicated hundreds of hours of CPU time and they have higher stats than me? Unfair!"

As long as a scale is set so that the stats gained from 100 hours of computing time in phase 1 = the stats gained from 100 hours of computing time in phase 2, everyone should find that fair.
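A rough sketch of the scale being described, assuming the points-per-hour rate of each phase can be measured on the same reference machine. The rates below are invented (they mirror the "10 times as long" example, not any real measurement), and the function names are hypothetical. Picking one representative rate per phase is itself an assumption.

```python
def phase2_scale(phase1_points_per_hour, phase2_points_per_hour):
    """Factor that makes one hour of phase 2 worth one hour of phase 1."""
    return phase1_points_per_hour / phase2_points_per_hour


def scaled_phase2_points(raw_points, scale):
    """Convert raw phase 2 points into phase 1 equivalent points."""
    return raw_points * scale


# Hypothetical rates measured on one reference machine.
PHASE1_RATE = 5000.0  # points per hour in phase 1 (assumed)
PHASE2_RATE = 500.0   # points per hour in phase 2 (assumed)

scale = phase2_scale(PHASE1_RATE, PHASE2_RATE)

hours = 100
points_phase1 = PHASE1_RATE * hours
points_phase2 = scaled_phase2_points(PHASE2_RATE * hours, scale)

print(scale, points_phase1, points_phase2)
```

With these assumed rates, 100 hours in either phase yields the same scaled total, which is the fairness condition stated above.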

TheOtherZaphod

01-22-2003, 12:48 PM

Just as a little food for thought I did a quick, counting-on-my-fingers-type, survey of the current project production. (forgive the rough math, but I really didn't do more than just glance at the numbers)

The top five contributors produced over 10% of last week's total.

Going down thru the top twelve or so, you get about 20%.

It takes the top 30 to get 1/3 of the production.

Just about the top 80 contributors are needed to get to 50%.

The top 200 users did just over 2/3 of the week's work.

Frankly, I was surprised to see that it wasn't more heavily weighted towards the top. If this were a game-show I would have guessed that the top 50 players did half the work, and that the top 100 did two thirds or more.

Please draw your own conclusions, but to me this says that even though heavy hitters make an obvious difference to the project, that more work than you would think is done by individuals, most running just a single machine.

Brian the Fist

01-22-2003, 01:25 PM

Originally posted by reader50
Scott, let me know when you plan to add the $15K of extra servers, and the SCSI RAID 10. (reader50 rubs hands together) :)

We do not qualify. We track 14 teams today, and expect to track 10 to 50 teams in full detail at any given time. Even if we wanted to track everyone, our server capacity is perhaps 10% of what we would need. Should this change in the future, we can certainly give Howard a call.

It is not out of the question that WE could host 3rd party stats on our servers, provided you were willing to give us your code of course, and provided it didn't eat up a lot of our bandwidth which is all that we are short on. We have no shortage of CPU or disk space really :cool:

FoBoT

01-22-2003, 02:19 PM

Originally posted by TheOtherZaphod

Please draw your own conclusions, but to me this says that even though heavy hitters make an obvious difference to the project, that more work than you would think is done by individuals, most running just a single machine.

i think that is why there is a "distributed" in "distributed computing" ;)

if the project grows in participants, those numbers will continue to skew away from the "Big" producers and even more heavily towards the "little guy". there just aren't that many people with access to large numbers of PCs (or the fortitude to stick with babysitting large numbers of PCs; it takes a lot of time/effort, especially if the box isn't directly connected to the internet)

Dyyryath

01-22-2003, 02:30 PM

MAD-ness

01-22-2003, 03:13 PM

TheOtherZaphod: thanks for doing that analysis, I hadn't taken the time to do one in a LONG time.

As the project grows the trend should continue to be towards the masses doing more and more work in comparison to the 'big hitters,' as someone else pointed out.

TSF was able to finally catch Free-DC and KWSN! because Ars just brought in so many users. As the project has matured and as it gets additional publicity, this will continue to be the case, for the project as a whole and for teams as well.

Especially with other projects ending (RC-5, ECCp109, GAH1 [not official, but as good as dead] and soon to be SETI 1).

'tis a good thing not to rely on a small number of people for production. :)

BTW, Douglas Adams fan? :)

pointwood

01-22-2003, 04:55 PM

Originally posted by The_Equivocator
I think that it is fair to both parties to just keep the stats. No new members are going to say, "This isn't fair! Why do I have 0 stats points even though I haven't yet contributed any processor time to DF? These other people have tens of thousands of stats points and have dedicated hundreds of hours of CPU time and they have higher stats than me? Unfair!"

Of course not, especially not if the new WUs are a bit faster, but will some of the old users not complain then?

As long as a scale is set so that the stats gained from 100 hours of computing time in phase 1 = the stats gained from 100 hours of computing time in phase 2, everyone should find that fair.

Which is exactly what will be difficult (if not impossible) to do. Please correct me if I'm wrong.

Have you read the whole thread? I think this has already been discussed (without a conclusion though).

Brian the Fist

01-22-2003, 06:33 PM

I think it is fair to say that the issue of zeroing the stats has been discussed ad nauseam :bang: and, at the risk of sounding cliche, any further comments would be akin to beating a dead horse. As a wise man once said, you can please some of the people all of the time, and all of the people some of the time... well, you know the rest. Rest assured I have read this entire thread and, whatever we choose to do, well, you'll learn to like it :rotfl:
Seriously though, it will be implemented in such a way that any or all of the suggestions discussed COULD be implemented, at any time after the fact. So regardless of what we choose to do, it may change or be enhanced if need be. The beta testing will hopefully identify any potential problems as well. So in the words of another wise man 'Don't Panic'.

Apparently no one has any other issues, which was the original question of this thread, which is a good thing I guess. I'll be doing some serious alpha testing next week and hopefully have it ready for distribution shortly thereafter. The initial beta will likely be for Linux and Windows only, and only the text client, but all supported OSes will be released together when it becomes official, of course.

In response to Zaphod's analysis: thanks, that was cool, and pretty much what we expected. That's exactly why we made the screensaver cooler, trying to attract more of the 'little guys'. After all, no matter how many computers lemonsqzz, or M2K1Guy or Brian the Fist can scrounge together, it will still be piddly compared to the ultimate potential available from 'average Janes and Joes' running the screensaver on their computers, and it is this sort of audience who we have to try to appeal to now to continue to swell our ranks (since we've cornered the market on you hard-core folks already :D ). The most difficult task will be making this sort of user aware of our project and its goals in the first place, so the more attention we can draw to our project, the better.

P.S. some CASP5 results are up on the web site

FBK

01-22-2003, 06:38 PM

Of course not, especially not if the new WU's are a bit faster, but will some of the old users not complain then?

As an "Old User", I think that I would prefer my stats to be worth less, as opposed to my old stats being worth nothing at all.

As long as a scale is set so that the stats gained from 100 hours of computing time in phase 1 = the stats gained from 100 hours of computing time in phase 2, everyone should find that fair.

Which is exactly what will be difficult (if not impossible) to do. Please correct me if I'm wrong.

Please correct me if I'M wrong, but no attempt was ever made to equalize stats since the beginning of the DF project. I.e., any given machine, running since the inception of DF, might have folded 4,500 folds per hour on one target, 11,000 folds per hour on the next target, and 3,200 folds per hour on another target.

FBK

01-22-2003, 07:36 PM

My last message was composed BEFORE Brian's "ad nauseam" message. Sorry.

I'm sorry, but I must point this out.

Zaphod said:

Just about the top 80 contributors are needed to get to 50%.

Correct me if I'm wrong, I'm surely no mathmagician, but does that mean that 1/28th of the TOP CURRENT CONTRIBUTORS contributed 14/28ths of last week's production?

That seems like a lot, per capita?

FBK

TheOtherZaphod

01-22-2003, 09:39 PM

LOL, just how many machines do you think those 80 people have involved in the project? I would guess that there is the equivalent of over 1000 full-time machines in that group. You ran a "small" farm yourself; how many boxes was that?

FBK

01-22-2003, 10:30 PM

Exactly. Certain folks such as yourself, and to a lesser extent myself, run farms, either at home or at work.

Accepting your stats, I come to a startlingly different conclusion.

You are surprised that the top producers, numerically, produce a relatively small amount of the total production.

I think that the large producers, such as yourself and others, produce not only big numbers, but also a disproportionately large percentage of total production.

Sorry for that long sentence.

My percentage figures, based on your figures, are as follows:

The top five contributors produced over 10% of last week's total.

* The top 0.2 percent (1/5 of 1 percent; 5 users out of 2308) of "ACTIVE" contributors produced 10% of last week's total!

* The top 0.5 percent (1/2 percent; 12 users out of 2308) of "ACTIVE" contributors produced 20% of last week's total.

* The top 1.3 percent (30 users out of 2308) of "ACTIVE" contributors produced 33% of last week's total.

* The top 3.5 percent (80 users out of 2308) of "ACTIVE" contributors produced 50% of last week's total.

* The top 8.7 percent (200 users out of 2308) of "ACTIVE" contributors produced 66% of last week's total.
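For anyone who wants to check the arithmetic, a small sketch that recomputes the "percent of active contributors" column. The 2308 active users and the rough production shares are taken from the figures quoted in this thread; the variable and function names are just for illustration.

```python
ACTIVE_USERS = 2308  # active contributor count quoted in the post

# (number of top users, approximate share of last week's production in %)
tiers = [(5, 10), (12, 20), (30, 33), (80, 50), (200, 66)]


def percent_of_active(top_n, active=ACTIVE_USERS):
    """Return what percentage of all active users the top_n users represent."""
    return 100.0 * top_n / active


for top_n, share in tiers:
    pct = percent_of_active(top_n)
    print(f"top {top_n:>3} users = {pct:.2f}% of active users, "
          f"about {share}% of production")
```

The shares themselves are rough eyeballed numbers, so only the user percentages are actually computed here.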

FBK

FBK

01-22-2003, 10:38 PM

Oops, sorry for the double post. I really must get out and post more often.

Tawcan

01-22-2003, 10:49 PM

Could someone summarize what's going on with the new algorithm and the stats system? Thanx. :)

It's a good thing to see the project managers want to hear from the users and are making the project user oriented. :thumbs:

Cheers! :cheers:

FlowerKid

01-22-2003, 11:46 PM

reader50...thanks, I promise to say nothing but nice things about Macs for the rest of the week, I'll even drop into our graphic design shop and say it to them (Mac loyalists)

Howard, ultimately I believe that you will do what you feel is best for the project. I am glad that I don't have to be the one to make the decision. I'm sure we will all learn to live with it over time, no matter what the decision.

reader50

01-23-2003, 12:37 AM

Originally posted by Brian the Fist
It is not out of the question that WE could host 3rd party stats on our servers, provided you were willing to give us your code of course, and provided it didn't eat up a lot of our bandwidth which is all that we are short on. We have no shortage of CPU or disk space really :cool:

Our current strategic plan does not call for tracking all members, or all teams. We just want to show off what cool stats our team has, as in "wow, let's join them!" This strategy does not require all teams to be tracked. Our motivation is team competition, and the offer would not fit. If special references to a specific team were removed, our team gains nothing. If such references are left, the project is playing team favorites on officially hosted pages.

However, I'd like to comment on the offer in general. Like Dyyryath says, it's incredibly generous. Also risky, I would recommend against doing such a thing.

First, there are security problems. My code is approaching 20K lines of code for tracking Distributed Folding. It could reasonably reach 30K lines before it has all of the currently planned features. Putting that much foreign code on a project server ... my code would not go looking for passwords or emails, but how could you be sure? Howard would have to sift through all that code, if only to protect the other project servers. It could take weeks to be sure about all of it, assuming Howard worked on nothing else during that time.

When I have a bugfix ready, or a new feature, he would have to review the foreign files again. Bug fixes or feature upgrades often involve modifications to dozens of site files. At the very least, I'd have to nag him each time to replace certain files. He could give an outside party (me) access to one of the project's servers and skip the file checking/nagging, but that would be even worse. If the donated server were separated from the project servers, then it would require additional bandwidth to reach the source data.

Any codebase that does serious 3rd party stats unattended is going to be large, it has to be to deal with so many things that can go wrong. Also, it will be large so it can present lots of neat data to the stats connoisseur. ps, I like this term much better than "stats-ho" ;)

Second, it is generally assumed that 3rd party stats relieve bandwidth demands on the project. Users increasingly visit the 3rd party stats instead of the basic project pages. Basically, the cooler stats pages are, the more hits they are going to draw. Having rather plain project stats pages is all we need for 3rd party stats, the magic comes from analysis of data over time. And plain project stats pages help send people over to those private stats servers, with their own bandwidth. I expect that having heavy stats available on the project servers will result in more people finding those pages. More visits, more page hits per visit, and generally larger pages being served up.

Third, there is another good reason to have plain pages on the server. Our pages contain a multitude of links leading back into the page. Column sort links, links to the next member's Personal page, alternate team links, page modification links, etc. Plain project pages have none of these things.

Several days ago, a Harvester hit our server and got stuck in the Free-DC dFold section. Personal pages, to be specific. It followed every link. It tried and Tried and TRIED to find emails for Free-DC members. After about 40 hours, we firewalled it off, but it had certainly saturated the box or our upload bandwidth during some of that time. 7K hits and 295 MB of page files, and it did not even find a single email for all that trouble. The only email in our pages is the team contact link, and that one is ASCII-encrypted, current harvesters cannot read it.

Today's plain project stats pages would be loaded once, and left alone.

We will be passing on the offer at this time, but I'd recommend great caution in allowing anyone to use it. Seems like it brings too many problems, along with tons of bandwidth demands.

Dyyryath

01-23-2003, 01:05 AM

Originally posted by reader50
It tried and Tried and TRIED to find emails for Free-DC members. After about 40 hours, we firewalled it off, but it had certainly saturated the box or our upload bandwidth during some of that time.

Well, we're a very popular bunch of people, you know. ;) :D

All joking aside, reader50 has brought up some valid points. He's probably right that hosting 3rd party stats isn't a good idea, but it sure was a cool thing for them to offer.

Actions like those just keep me firmly rooted here at DF. :thumbs:

HaloJones

01-23-2003, 06:05 AM

Please correct me if I'M wrong, but no attempt was ever made to equalize stats since the beginning of the DF project. I.e., any given machine, running since the inception of DF, might have folded 4,500 folds per hour on one target, 11,000 folds per hour on the next target, and 3,200 folds per hour on another target.

And at the risk of furthering the "ad nauseam" the above quote from FBK is the heart of this issue. The project has never tried to have a consistent scoring approach. Unlike Stanford which has different scores for each protein, DF gives a point per structure. Some are quick, some are slow. When the new client came in that re-analysed already analysed proteins and was therefore much faster, did anyone complain that it was unfair? Not that I saw.

If the new method awards twice as many points as now, it won't be any different from what has gone before. The scoring method has never been "fair" so leave it as it is, please.

Grumpy

01-23-2003, 07:55 PM

I am with you HaloJones, less than 30% of people in OCworkbench who responded said they wanted the Stats set to 0. Just leave it as it is and see how it goes, if people think it is too out of whack then we will see what can be done. But no one has ever complained seriously before....

pointwood

01-24-2003, 03:42 AM

Well, there is a difference between DF and F@H in that we all crunch on the same protein all the time. On F@H you can get several different types that take various lengths of time to crunch.

If I had to choose, I think I would reset the stats, mainly because what I find fun is the race and competition, and currently we are so far ahead that we just have no competition. A reset would give all other teams a big opportunity to ramp up and try to compete with us.

Michael H.W. Weber

01-24-2003, 05:46 AM

Stats, stats, stats - I read a lot about stats.

The more important question is: With the new DF algorithm, will results be uploadable from any computer, regardless of where they have been calculated, or not? If not, many members from our team will have to say good-bye to this project.

After the CASP5 results have now become available, one can conclude that this project has produced moderately good results. Moreover, as I suspected above, the announced changes in the algorithm are a direct result of this CASP5 result feedback. Hence, the current algorithm has proven not to be sufficient to continue with it. As a consequence, I would like to ask how long it will approx. take until the new algorithm will be in place as I don't see much reason to continue computing with this one.

All the best,
Michael.

P.S.: Don't get me wrong. Without the past efforts in supporting this project we would never have learned how to improve it. I am looking forward to the new approach hoping that the "upload problem" will be solved. :D

runestar

01-24-2003, 06:54 AM

Howard,

Can we run two stats engines? Say the main stats will zero phase I and count phase II only, but the old stats will still be available. The secondary stats engine, donated by someone, would do the combo of the two phases. And to help offload the extra bandwidth, mirror the secondary stats engine at sites where people are willing to contribute some space and the bandwidth to host it?

One question though before anybody comments on that... I haven't seen anybody really ask this. From what you are saying it seems the scoring system will be completely different for Phase II. It seems that it will be point valued based, but Phase I is structure based. So how do you mix those two? 6 oranges + 3 apples = 9 fruit? Does that really mean anything mixing them?

Personally I am in favor of only counting Phase II work once we begin that... but my question remains regardless of that. Is no one else seeing this? Am I just totally missing something here?

Best,

RuneStar½

P.S. I'm running the screensaver on both my systems, Howard. =)

Grumpy

01-24-2003, 07:30 AM

Agreed, to lose the ability to upload remote computers' output would be a hard blow. Contrary to all reports, money is easy to find for research; it is the human resources that are hard to find. Money cannot magically produce people who can do what the folding community provides. We may be a bunch of cranky pants, but we are productive little cranky pants :p

I just can't wait to see the new client in action, and just double my medication if they zero the stats :crazy:

Welnic

01-24-2003, 08:48 AM

Originally posted by Brian the Fist

< snip >

- as a side effect, each machine will need to be uniquely identified so it will not be easy to, say, generate on one machine and then upload from another - this is necessary to avoid other nasty potential problems but should not affect people who use proxy servers or firewalls, only physically moving around data will cause trouble.
- you will at most be able to buffer 50 generations (about 2 days work (maybe)) but this could change

I'll add more stuff as I think of it. Comments and complaints are welcome. Keep in mind some concessions will be necessary to get this new more complicated algorithm to work, which may include alienating some users with special needs.

EDIT:

Actually, I think we can do it without those last two points, so scratch those off

So Howard, I guess after all the mention of those last two points, the next time that you edit you would just scratch those off yourself instead of telling us to. :D

pointwood

01-24-2003, 11:35 AM

Originally posted by FBK
As an "Old User", I think that I would prefer my stats to be worth less, as opposed to my old stats being worth nothing at all.

I don't think your stats suddenly become worth nothing, but I know what you mean. I personally think your stats are worth just as much as they were before. Actually, to me and the rest of Team Stir Fry, I think they are worth more, since if Howard decides to reset the stats, we can forever claim to have "won" the first phase. I don't care that much about that though, since it's not about who has crunched the most, it's about who got the best structure :)

Please correct me if I'm wrong, but no attempt was ever made to equalize stats since the beginning of the DF project. I.e., any given machine, running since the inception of DF, might have folded 4,500 folds per hour on one target, 11,000 folds per hour on the next target, and 3,200 folds per hour on another target.

Good point :)
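To make the "equalizing" idea concrete, here is a rough sketch of one way it could be done: convert raw structure counts into hours of work at a reference rate for each target. This is only an illustration, not anything the project does or has announced. The target names and the function names are made up, and the rates are just the per-hour figures quoted above.

```python
# Hypothetical reference rates (structures per hour) for three targets,
# taken from the figures quoted above. The target names are invented.
REFERENCE_RATES = {
    "target_a": 4500,
    "target_b": 11000,
    "target_c": 3200,
}


def reference_hours(structures, structures_per_hour):
    """Convert a raw structure count into hours of work at the reference rate."""
    if structures_per_hour <= 0:
        raise ValueError("rate must be positive")
    return structures / structures_per_hour


def total_reference_hours(counts, rates):
    """Sum equalized work over every target a user contributed to."""
    return sum(reference_hours(counts[target], rates[target]) for target in counts)


if __name__ == "__main__":
    # 9,000 structures on a slow target and 11,000 on a fast one.
    my_counts = {"target_a": 9000, "target_b": 11000}
    print(total_reference_hours(my_counts, REFERENCE_RATES))
```

Measured this way, an hour on a slow target counts the same as an hour on a fast one, which raw structure counts do not do.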

runestar

01-24-2003, 02:57 PM

Well, there has been no point value assigned to the stats. It's just been a raw number. Then again, we've primarily been brute-forcing the structures, so there hasn't been a whole lot of point to any kind of value system.

RS½

jkeating

01-26-2003, 03:00 PM

Originally posted by m0ti
... In the meantime, there's always dfDetect (http://t2.technion.ac.il/~sm0ti/dfDetect.zip), which is a little Win32 utility I made (by team-mate request) that does some worthwhile stuff:

- works for DF installed as a service (or multiple services) or CLI.
- make sure DF is always running
- run DF in completely hidden mode (useful for Win9x users)
- restart DF after X minutes have passed
- stop DF/keep DF from running while program X is running (great for corporate farmers with CAD machines/gamers).

In any case, I'll probably be back doing the dfQ thing soon (I hope). February perhaps will see a release (I was originally hoping for January, but life got in the way ;)).
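For anyone curious how the rules in that feature list fit together, here is a minimal sketch of the decision logic only. It is not dfDetect's actual code: the names `WatchdogConfig` and `decide_action` are invented for illustration, and a real tool would still have to detect processes and start or stop the client itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchdogConfig:
    # Restart the client after this many minutes; None disables restarts.
    restart_after_minutes: Optional[float] = None
    # Names of programs that should keep the client from running.
    blocking_programs: frozenset = frozenset()


def decide_action(client_running, minutes_up, running_programs, cfg):
    """Return one of 'start', 'stop', 'restart', or 'none'."""
    # A blocking program (a CAD package or a game, say) takes priority.
    if cfg.blocking_programs & set(running_programs):
        return "stop" if client_running else "none"
    # Otherwise make sure the client is always running.
    if not client_running:
        return "start"
    # Optionally restart after X minutes of uptime.
    if cfg.restart_after_minutes is not None and minutes_up >= cfg.restart_after_minutes:
        return "restart"
    return "none"


if __name__ == "__main__":
    cfg = WatchdogConfig(restart_after_minutes=60,
                         blocking_programs=frozenset({"cad.exe"}))
    print(decide_action(True, 10, {"cad.exe"}, cfg))
    print(decide_action(True, 120, set(), cfg))
```

A loop that calls `decide_action` every few seconds and acts on the result would cover the "always running", "restart after X minutes" and "stop while program X is running" items; running hidden or as a service is a separate, platform-specific matter.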

m0ti,

I tried to download dfDetect but I got an error message saying that the server was unavailable :confused:

I would also be very interested in dfQ... I've got access to about 30 PCs that don't have Internet access... :eek: I sure would like to keep them busy on a great project. :D