Thread: SB's stats: open discussion

  1. #1

    SB's stats: open discussion

    I'd like to get all of your ideas and suggestions on how to improve SB's project statistics. Personally, I think we have among the best stats of any distributed project yet -- but I'd like to make them even better!

    Here are some of the ideas that I've heard mentioned, either by Louie or by users on the forum.

    1) Higher-resolution user stats, i.e. actually trying to get data on how much CPU time the client was getting when, instead of simply smoothing the work over the entire lifespan of the test. (I have my doubts that this is feasible, but if a lot of you beg, I'll really put some hard thought into how we might be able to do it.)

    2) Breakdown of production/rate/etc. by machine type, giving a table of the most productive machines or operating systems and so on.

    3) Breakdown of production/rate/etc. by country, using a reverse-lookup of the IP addresses. This would give a neat table on which countries do the most work for SB, and might inspire some friendly, competitive nationalism in people.

    4) Take-over times for user and team rankings, i.e. an indicator of "how long it will be at your current rate to overtake the person ahead of you in the rankings."
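A minimal sketch of how the take-over estimate in item 4 might be computed, assuming each ranked user has a running total and a current rate in the same units (for example cEMs and cEMs per day). The function name and parameters are hypothetical, not part of the SB stats code.

```python
from typing import Optional


def takeover_days(my_total: float, my_rate: float,
                  their_total: float, their_rate: float) -> Optional[float]:
    """Days until my total passes theirs at current rates.

    Returns 0.0 if I am already level or ahead, and None if I am not
    closing the gap (my rate is not higher than theirs).
    """
    gap = their_total - my_total
    if gap <= 0:
        return 0.0
    closing_rate = my_rate - their_rate
    if closing_rate <= 0:
        return None  # never catches up at the current rates
    return gap / closing_rate
```

The None case matters for display: a user who is falling behind would see something like "never at current rate" instead of a negative or infinite time.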

    Everyone's encouraged to share their ideas, as I'm open-minded. Help us make our project not only the most powerful, but the most FUN distributed project yet!

  2. #2
    Member
    Join Date
    Oct 2002
    Location
    Copenhagen.
    Posts
    43
    I think the graphs should be more colorful... different colors for each IP address returning results in the personal graphs; one color for each team member in the team graphs, and one color for each sequence of numbers in the general graph. But I guess I mentioned that one before! Along with publishing which numbers each member has processed, and which are pending, in raw data.

    Mevs Tarje

  3. #3
    If I had only one request, it would be to replace cEMs/sec with something that was equal for all values of n and k. I don't know what would be feasible, however (just something better than processing time, or I'll fire up all my old 486s lol).

  4. #4
    Senior Member eatmadustch's Avatar
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    154
    What would be nice is to get awards (as in seti@home and theneoproject). There you get awards once you've completed a certain number of work units. Here you could get awards for completing a certain number of proth tests or cEMs. What would also be nice is to be able to see from the personal stats if you've found a prime or not, and if so which one, when, how long it took to compute ... and so on.
    Another idea is to see how many and what kind of processors a user has (TheNeoProject also has this). This would require a new version of sob, I guess!
    But I agree with you ... sob has the best stats by far!
    EatMaDust


    Stop Microsoft turning into Big Brother!
    http://www.againsttcpa.com

  5. #5
    Senior Member
    Join Date
    Dec 2002
    Location
    Madrid, Spain
    Posts
    132
    Originally posted by eatmadustch
    What would be nice is to get awards (as in seti@home and theneoproject). There you get awards once you've completed a certain number of work units. Here you could get awards for completing a certain number of proth tests or cEMs.
    I think that will only bring more cheating. The real awards are the primes.

  6. #6
    A few things:

    1) I've already implemented one of the ideas I mentioned in my first post. A breakdown of work, rate, and so on by country is now available at this URL:

    http://www.seventeenorbust.com/stats/byCountry.mhtml

    2) Regarding cEMs, they are *definitely* going to be changed to a more suitable unit. This was actually on our list of things to do before putting the new system online, but we had to accelerate our plans due to being slashdotted and so on.

    3) About prizes: we can't do it. That is to say, unless we are able to find a sponsor willing to support prizes, which is possible. Louie and I have no financial backing whatsoever for this project; we don't get paid; it's purely something we've done in our own time as a pet project. This is fine, because we really enjoy it and never wanted to get paid. But it also means that we're too poor to have much to throw around in the way of glitzy prizes.

    I'll get back to you on the other ideas that have been posted so far. I like the discussion, and I'm glad everybody's sharing their thoughts. Keep it up!

  7. #7
    Originally posted by kugano
    3) About prizes: we can't do it. That is to say, unless we are able to find a sponsor willing to support prizes, which is possible. Louie and I have no financial backing whatsoever for this project; we don't get paid; it's purely something we've done in our own time as a pet project. This is fine, because we really enjoy it and never wanted to get paid. But it also means that we're too poor to have much to throw around in the way of glitzy prizes.
    I, for one, am totally against prizes. If I had the chance at financial gain by running the client, I'd have to stop running the client on several processors. Personally, knowing that the project is advancing mathematical research is good enough motivation for me.

  8. #8
    Originally posted by kugano
    1) I've already implemented one of the ideas I mentioned in my first post. A breakdown of work, rate, and so on by country is now available at this URL:

    http://www.seventeenorbust.com/stats/byCountry.mhtml
    Am I really 85% of Canada? Wow.

  9. #9
    Junior Member
    Join Date
    Dec 2002
    Location
    Kemah, Texas
    Posts
    18
    from Paulie at ars...

    "Would you add a suggestion over there for me: Total test to run until project is over (maybe include how many cEMs needed to clear everything, then a compare to current rate to get the projected completed date), even assuming that there are no primes to be had in the remaining groups. Then adjust as each K is cleared."

  10. #10
    Junior Member
    Join Date
    Dec 2002
    Location
    Kemah, Texas
    Posts
    18
    Talking about prizes...

    I think he was speaking in terms of a paper certificate similar to SETI...something you can print and frame...

    although a lot of people may think it a silly idea, I wonder...I'm looking at an office wall that could have a nicely done certificate for SETI, ECCP-109, DPad, ECC2 neatly framed in a classic thin black...
    all the docs and lawyers can have all their papers hung with care (has anyone actually got close enough to see what they really are?)...why can't we?

    A very long time ago I had this card from somewhere in Russia that I was really very proud of...were they called SQL cards? (it was a long time ago and I don't remember so well) People listened to a shortwave radio and collected them...

    personally, I'm tickled pink to think that I'm helping find some of the largest prime numbers known to mankind, and having a framed piece of paper that said so hanging on the wall would actually be kinda neat...hey, you guys have put up the classiest looking DC web site ever, I'd love to see you design a certificate...

  11. #11
    Senior Member eatmadustch's Avatar
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    154
    Originally posted by kugano


    About prizes: we can't do it. That is to say, unless we are able to find a sponsor willing to support prizes, which is possible. Louie and I have no financial backing whatsoever for this project; we don't get paid; it's purely something we've done in our own time as a pet project. This is fine, because we really enjoy it and never wanted to get paid. But it also means that we're too poor to have much to throw around in the way of glitzy prizes.
    I never said I wanted prizes ... I have enough motivation thinking I could be in all the math books on prime numbers

    I said I wanted "awards" and I guess I didn't quite say that so everyone understood. They are called Certificates of Appreciation in TheNeoProject (go to the stats page then on the leader, then you'll see that he has several Certificates of Appreciation with lots of nice fractals). In Seti@home they are called Workunit Certificates. You can see what they are at the bottom of http://setiathome.ssl.berkeley.edu/f...user_stats_new (one of the leaders)
    EatMaDust


    Stop Microsoft turning into Big Brother!
    http://www.againsttcpa.com

  12. #12
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    How accurate is the reverse lookup when it comes to identifying the country? Are there databases which assign the proper countries to certain nets? Or is there additional information provided by the lookup?

    Another option would be that everyone inserts this info in his preferences - maybe with an option whether or not they want this shown on the website.
    There we have the advantage that the country the person works for is counted, not the country of the subnet the PC is in. Usually this should be the same, but there are some exceptions (was it smh?)...

    CPU type / OS distribution would be the next important item on the list for me...

  13. #13
    Originally posted by Mystwalker
    How accurate is the reverse lookup when it comes to identifying the country?
    From the new page
    Beware: these statistics are only estimated. They are based on the two-letter country code of the IP address from where the work was submitted. Since some IP addresses don't resolve properly to hostnames, and some hostnames' country codes are not really the country where the IP is located, there is inherently some error in these numbers. Also, it should be noted that non-country-code domains like .com, .net, .edu and so on are included in the United States' totals, since most US domains end in these instead of .us.
    Originally posted by Mystwalker
    Are there databases which assign the proper countries to certain nets?
    Yes. Whether they're freely available, I'm not so sure.

    Or is there additional information provided by the lookup?
    Maybe something can be built with ARIN whois data.

  14. #14
    When you do a reverse DNS lookup on an IP address, you get a hostname like "foo-bar-27.wanadoo.fr". The by-country rank script extracts the top level domain (TLD) "fr" from the address, consults an internal lookup table of all the country codes, and tallies that address up to the totals for France.

    This works reasonably well for every country but the United States. Even though there is a "us" TLD, its use is mostly restricted to government or educational use. The script just assumes that "com", "edu", "net", "org", "gov", "mil" and anything else that isn't a country code is probably from the United States. I suspect the US's totals are probably artificially high because of this, but probably not by too much (certainly not enough for it to drop from the #1 spot).
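A rough sketch of the TLD-based tally described above, assuming a small illustrative lookup table rather than the script's full list of country codes. The function names and the table contents are hypothetical; `socket.gethostbyaddr` is the standard-library call for a reverse DNS lookup.

```python
import socket

# Illustrative subset only; the real script consults a full table of
# two-letter country codes.
COUNTRY_CODES = {
    "fr": "France",
    "de": "Germany",
    "ca": "Canada",
    "uk": "United Kingdom",
}


def reverse_lookup(ip: str) -> str:
    """Return the hostname for an IP address via reverse DNS."""
    return socket.gethostbyaddr(ip)[0]


def country_from_hostname(hostname: str) -> str:
    """Map a hostname to a country by its top-level domain.

    Anything that is not a known country code (com, net, edu, ...) is
    counted as the United States, as described in the post above.
    """
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    return COUNTRY_CODES.get(tld, "United States")
```

With this table, `country_from_hostname("foo-bar-27.wanadoo.fr")` falls under France, while any `.com` host lands in the United States bucket, which is exactly the bias discussed above.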

    There probably is a better way to get more accurate country information from ARIN, RIPE and so on, but it would probably be a lot slower and a LOT more programming. If anybody knows of easier ways to do this, I'm all ears.

    And, to Mystwalker, yep: OS/machine-type breakdowns are now at the top of the list for new stats to implement.

  15. #15
    Regarding 'prizes': You're right! I completely misread your original post. My apologies! I do like the idea of some sort of 'award' for prime discoveries. A serious possibility that Louie and I actually considered, but never actually did, was to send the winning participants oversized posters with the SB logo and the prime's decimal expansion on the poster.

    I made such a poster (measuring 3 feet tall by just over 6 feet wide, currently hanging along a wall in my bedroom), and I must say it's quite impressive.

    A hugely-scaled-down image of this poster is here:

    http://maximus.cvm.uiuc.edu/composite.gif

    (Scroll to the right to see the logo and descriptive text. This image is something like 3% the scale of the actual image used to plot the poster at 300 dpi.)

    The reason we never went through with this is that plotter time for these posters isn't easy to come by. I have to suck up to my boss at the University so that she won't mind the massive ink drain on the plotters. But, if we can come up with some arrangement to get the posters more easily, or get someplace like Kinko's to "sponsor the project" by giving us discounts on plots, ... well, we'll have to see.

  16. #16
    Senior Member eatmadustch's Avatar
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    154
    wow, I would love one of those posters ... gotta find more machines to run sob on
    EatMaDust


    Stop Microsoft turning into Big Brother!
    http://www.againsttcpa.com

  17. #17
    Use subnetting to find it out. Each country is assigned its own ranges of IP addresses. Get the IP of the submitter and check what range it is from. Using DNS is quite unreliable.
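A minimal sketch of the range check suggested here, assuming a table of (first address, last address, country) rows is available from somewhere. The rows below are made-up samples using documentation address blocks, not real allocations, and the names are hypothetical.

```python
import bisect
import ipaddress
from typing import Optional


def _to_int(ip: str) -> int:
    return int(ipaddress.IPv4Address(ip))


# Hypothetical sample data: (first, last, country), sorted by first address.
RANGES = sorted([
    (_to_int("192.0.2.0"), _to_int("192.0.2.255"), "FR"),
    (_to_int("198.51.100.0"), _to_int("198.51.100.255"), "CA"),
])
STARTS = [first for first, _, _ in RANGES]


def country_for_ip(ip: str) -> Optional[str]:
    """Find the range containing ip by binary search, or None."""
    value = _to_int(ip)
    # Index of the last range whose first address is <= value.
    index = bisect.bisect_right(STARTS, value) - 1
    if index >= 0 and value <= RANGES[index][1]:
        return RANGES[index][2]
    return None
```

This assumes the ranges do not overlap. With overlapping allocations, the lookup would need to collect every matching range and pick the narrowest one.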

  18. #18
    Why not make it possible to enter your country in the preferences? If nothing is in there you could always do a reverse lookup

  19. #19
    Downsized Chinasaur's Avatar
    Join Date
    Dec 2001
    Location
    WA Wine Country
    Posts
    1,847
    Kugano,

    If you have a PayPal account I'll give a dollar or two for the posters. If a few hundred gave a dollar or so...it would more than pay for a nice Kinko poster.
    Agent Smith was right!: "I hate this place. This zoo. This prison. This reality, whatever you want to call it, I can't stand it any longer. It's the smell! If there is such a thing. I feel saturated by it. I can taste your stink and every time I do, I fear that I've somehow been infected by it."

  20. #20
    Originally posted by Chinasaur
    Kugano,

    If you have a PayPal account I'll give a dollar or two for the posters. If a few hundred gave a dollar or so...it would more than pay for a nice Kinko poster.
    Yes, I can forward a couple bucks also, to a PayPal account.

  21. #21
    Member
    Join Date
    Dec 2002
    Location
    new york
    Posts
    76
    I miss the larger graphs. In fact, anything to do with graphs is cool. If they're generated on the fly, it would be nice if the viewer could enter a time range.

    Also, it seems there are two ways to configure your stats, either run all machines under a single username or make yourself a team and run each machine as its own user. Really we should go 3 levels deep (team-->user-->machine).

    This makes the data model more complex, but you could put the machine stats on the user page.

  22. #22
    It would be great if we could see which numbers are in progress, which are done, at what time, etc.

    Or give us the old graphs per K back, at least that gave an idea how many exponents below a certain breakpoint are still pending.

  23. #23
    I'd like to see some sort of change to the overall rate stat. I dabbled in the project in its early days then completely stopped running it for quite some time. It's only in the last couple of months that I've started seriously running it. But now I'm ranked in the 500s for overall production (and will probably stay there unless I find a supercomputer to run the client on) because of all those months of no activity.

  24. #24
    Senior Member dmbrubac's Avatar
    Join Date
    Dec 2002
    Location
    Ontario Canada
    Posts
    112

    Only 2 Canadians?

    Originally posted by shifted
    Am I really 85% of Canada? Wow.
    Well I appear to be the other 15%, which is a bit disappointing. Are there really only 2 Canadians working on this?

    BTW I would like to see trend indicators on the per country stats, just like on the individual stats.

    Since shifted and I are wondering about per country participation, could we also treat countries like teams and see the other members/citizens and their production?

  25. #25
    Dungeon Master alpha's Avatar
    Join Date
    Mar 2002
    Location
    Norfolk, UK
    Posts
    1,700
    Originally posted by smh
    Why not make it possible to enter your country in the preferences?
    I second this. I always submit work from a .com host but am located in the UK.

  26. #26
    Originally posted by alpha
    I second this. I always submit work from a .com host but am located in the UK.
    I third this

  27. #27
    While I admit the DNS solution isn't perfect, I'm not sure I'm ready to allow users to choose their country in their preferences either. The reason is that, while certainly no one here falls into this category, there are a lot of dishonest people out there who would choose a country other than their actual country. The by-country page would then become a popularity contest, and that's not what it's meant to be at all.

    I'm going to look into the idea someone suggested earlier on the forum about using IANA/ARIN/RIPE/etc. data to determine country ... someone, somewhere, has to have come up with a database mapping IP ranges to countries.

  28. #28
    Any RPSL compliant inetnum object should have a country field containing the 2-letter ISO-3166 country code - but bear in mind that the object data may have been edited by error-prone humans (like me). Unfortunately, there's no guarantee that ARIN data will be as standard as RADB/RIPE/APNIC/CW etc

    A query to whois with the flag to search any database will cover most bases, but would generate a LOT of lookups - you should then either cache answers, or probably run your own IRRd with live updates from RADB/RIPE etc. This may be more involvement than you're looking for....

    Using the RIPE whois client, the following pipeline would give you approximately what you're after:

    whois -a -r -T inetnum <ipaddress> | egrep '^country:'

    Further pointers available upon request.
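A sketch of how the pipeline above might be automated with caching, assuming the RIPE whois client is installed as `whois` and accepts the flags exactly as given in this post. The function names are hypothetical, and the regular expression is a case-insensitive stand-in for the `egrep '^country:'` step.

```python
import functools
import re
import subprocess
from typing import List, Tuple

# Matches lines such as "country:        FR" at the start of a line.
COUNTRY_LINE = re.compile(r"^country:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE)


def parse_countries(whois_text: str) -> List[str]:
    """Extract every country code from raw whois output, upper-cased."""
    return [match.group(1).upper() for match in COUNTRY_LINE.finditer(whois_text)]


@functools.lru_cache(maxsize=None)
def lookup_countries(ip: str) -> Tuple[str, ...]:
    """Run the whois query once per address and cache the answer.

    Assumes a RIPE-style whois client on the PATH; flags are copied
    from the post above.
    """
    result = subprocess.run(
        ["whois", "-a", "-r", "-T", "inetnum", ip],
        capture_output=True, text=True, check=True,
    )
    return tuple(parse_countries(result.stdout))
```

The in-memory cache only avoids repeat queries within one run. A stats job that runs daily would probably want to persist the answers between runs.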

  29. #29
    Excellent! I'm looking into automating this now. Is there a way, as far as you know, to restrict the whois results to authoritative entries only? Non-authoritative sources return (incorrect) country codes, and at the moment I can't think of any way to algorithmically determine which of the several entries is authoritative (a human can do it by reading the comments, but Perl ... ?)

  30. #30
    I would like to see the individual graphs being correct.

    I have accessed the site multiple times and see my production plummeting on the graph. I then run to my machines and find out nothing is wrong and they are all running full bore.

    Why do the graphs show incorrect info??


    I'd also like to see the individual pages updated more than once a day.
    outlnder
    *************
    Team Prime Rib

  31. #31
    To choose among several inetnums, pick the most specific one, i.e. the one with the longest prefix: e.g. a /29 (8-address block) wins over a /24 (256-address block, a class C network). In simplistic terms, an easy algorithm to calculate the size of the block would be to subtract the first IP address from the last IP address whilst treating both as 32-bit unsigned values (use inet_aton for this), and add 1 to the result. The result should be a power of 2, with each doubling or halving representing a change in the prefix length of one bit. Any result which is not a power of 2 will be an invalid entry. In these terms, you would choose the inetnum with the smallest result as your preferred value.
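The selection rule above can be sketched as follows, assuming each whois entry has already been reduced to a (first address, last address, country) triple. The names are hypothetical, and Python's `ipaddress` module stands in for inet_aton.

```python
import ipaddress
from typing import List, Optional, Tuple


def block_size(first: str, last: str) -> Optional[int]:
    """Number of addresses in the range, or None if it is not a power of 2."""
    size = int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first)) + 1
    # A positive power of two has exactly one bit set.
    if size <= 0 or size & (size - 1) != 0:
        return None
    return size


def most_specific(entries: List[Tuple[str, str, str]]) -> Optional[str]:
    """Country of the smallest valid block among (first, last, country) rows."""
    valid = []
    for first, last, country in entries:
        size = block_size(first, last)
        if size is not None:
            valid.append((size, country))
    if not valid:
        return None
    return min(valid, key=lambda row: row[0])[1]
```

This follows the simplified test in the post and checks the size only. A stricter check would also confirm that the first address is aligned to the block size.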

    If you search on google for iso3166, you'll find things like http://nl.ijs.si/gnusl/cee/std/ISO_3166.html. The column of 2 letter codes is the one we care about. With one exception that I know of, if a value from the whois lookup is not a case-insensitive match for an item from this column - it is invalid and should be ignored. The one exception is UK, which is in use as a country code and as a TLD, although the 'correct' value should have been GB. Treat UK as a synonym for GB and you'll get the right answers.
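A small sketch of the validation step just described, assuming the two-letter column has been loaded into a set. Only a handful of codes are listed here for illustration, and the function name is hypothetical.

```python
from typing import Optional

# Illustrative subset of ISO 3166 two-letter codes; a real table would
# hold the full column from the standard.
ISO_3166_ALPHA2 = {"CA", "CH", "DE", "DK", "ES", "FR", "GB", "US"}


def normalize_country(code: str) -> Optional[str]:
    """Return the canonical upper-case code, or None if it is invalid.

    UK is treated as a synonym for GB, per the exception noted above.
    """
    code = code.strip().upper()
    if code == "UK":
        code = "GB"
    return code if code in ISO_3166_ALPHA2 else None
```

Invalid values come back as None so the caller can skip that whois entry and fall through to the next candidate.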

    Hope this is useful.

  32. #32
    There seem to be some inconsistencies between the upper bound number that has been tested and the lowest number under test for a given K.

    Is this because some tests were manually added?

  33. #33
    It's because of dropped (or, actually, "late") tests: there may still be tests out for lower n values which have already been returned by other users. So the upper bound climbs while the "min n under test" remains low, since the test is still out to some user (even though another user already reported it).

  34. #34
    Senior Member Supp's Avatar
    Join Date
    Dec 2001
    Location
    Czechia, EU
    Posts
    558
    Originally posted by kugano
    While I admit the DNS solution isn't perfect, I'm not sure I'm ready to allow users to choose their country in their preferences either. The reason is that, while certainly no one here falls into this category, there are a lot of dishonest people out there who would choose a country other than their actual country. The by-country page would then become a popularity contest, and that's not what it's meant to be at all....
    Well, I think that most people are proud of country they live in & crunch for so "popularity contest" is very unlikely and you wouldn't have to dig through all that WHOIS stuff.

    Just an opinion, though.
    rm -Rf /

  35. #35
    Senior Member eatmadustch's Avatar
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    154
    I don't know. I think a lot of people wouldn't take it seriously and enter stuff like Andorra, Liechtenstein, Vatican City (all extremely small countries) and other "silly stuff" like that! Or, a lot of "anti-americans" would enjoy entering Iraq and Afghanistan!
    EatMaDust


    Stop Microsoft turning into Big Brother!
    http://www.againsttcpa.com

  36. #36
    Originally posted by eatmadustch
    I don't know. I think a lot of people wouldn't take it seriously and enter stuff like Andorra, Liechtenstein, Vatican City (all extremely small countries) and other "silly stuff" like that! Or, a lot of "anti-americans" would enjoy entering Iraq and Afghanistan!
    Very good point. I'd stick with the whois stuff as it's reasonably accurate now and already implemented.

  37. #37

    small nit in stats display

    I currently have this in the stats display for my team:

    40 DeathAxe 1000.00 M cEMs 8.32 K cEMs/sec 7.03 K cEMs/sec

    1000.00 M is 1.00 G of course

    Just thought I'd mention it
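One plausible cause of a "1000.00 M" display is choosing the prefix before rounding, so a value just under 1 G rounds up to 1000.00 after the prefix is already fixed. A minimal sketch of a formatter that rounds first, assuming decimal (factor 1000) prefixes; the function name is hypothetical and this is not the site's actual code.

```python
PREFIXES = ["", "K", "M", "G", "T"]


def format_scaled(value: float, unit: str = "cEMs") -> str:
    """Format a value with a decimal prefix, never showing 1000.00 of a prefix."""
    scaled = float(value)
    index = 0
    # Compare the rounded value, so 999.996 M is promoted to 1.00 G.
    while index < len(PREFIXES) - 1 and round(scaled, 2) >= 1000:
        scaled /= 1000
        index += 1
    parts = [f"{scaled:.2f}", PREFIXES[index], unit]
    return " ".join(part for part in parts if part)
```

The key point is the `round(scaled, 2)` in the loop condition: the comparison uses the same precision as the final display.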

  38. #38

    another stats nit

    I was happily surprised that TeamRetro jumped from #5 to #2 on 'last day's rate' (and got the blue up arrow)... I was even more surprised when I looked at today's update and found the nice blue arrow still there, even though we are still at #2. Probably something simple to fix, as I'm sure TeamPrimeRib would not want to get caught TOO soon.

  39. #39
    Junior Member
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2

    Re: Only 2 Canadians?

    Originally posted by dmbrubac
    Well I appear to be the other 15%, which is a bit disappointing. Are there really only 2 Canadians working on this?

    Between shovelling snow, sharpening my skates, and eating back bacon, I'm also a Canadian, and judging by the current stats, I believe I will have you and shifted begging for mercy in a few weeks' time. I just need to catch up...



    EggmanEEA

  40. #40
    Senior Member dmbrubac's Avatar
    Join Date
    Dec 2002
    Location
    Ontario Canada
    Posts
    112
    That post is from before the country stats were changed. I believe Louie had lumped all the .com, .org, etc. into the US stats. They took a big shift a couple of weeks ago and now it is clear I am NOT one of only two Canucks.

    I feel much better now.....
