SB's stats: open discussion



kugano
12-16-2002, 01:57 AM
I'd like to get all of your ideas and suggestions on how to improve SB's project statistics. Personally, I think we have among the best stats of any distributed project yet -- but I'd like to make them even better!

Here are some of the ideas that I've heard mentioned, either by Louie or by users on the forum.

1) Higher-resolution user stats, i.e. actually trying to get data on how much CPU time the client was getting when, instead of simply smoothing the work over the entire lifespan of the test. (I have my doubts that this is feasible, but if a lot of you beg, I'll really put some hard thought into how we might be able to do it.)

2) Breakdown of production/rate/etc. by machine type, giving a table of the most productive machines or operating systems and so on.

3) Breakdown of production/rate/etc. by country, using a reverse-lookup of the IP addresses. This would give a neat table on which countries do the most work for SB, and might inspire some friendly, competitive nationalism in people.

4) Take-over times for user and team rankings, i.e. an indicator of "how long it will be at your current rate to overtake the person ahead of you in the rankings."
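
A minimal sketch of how the take-over indicator in idea 4 might be computed, assuming both users keep producing at their current constant rates. The function name and parameters are hypothetical, not part of the SB stats code.

```python
def takeover_seconds(my_total, my_rate, their_total, their_rate):
    """Seconds until my_total passes their_total at constant rates.

    Returns 0.0 if already ahead (or tied), and None if the gap never closes.
    Totals are in cEMs, rates in cEMs/sec (any consistent unit works).
    """
    gap = their_total - my_total
    if gap <= 0:
        return 0.0
    closing_rate = my_rate - their_rate
    if closing_rate <= 0:
        return None  # the user ahead is at least as fast, so no take-over
    return gap / closing_rate
```

Returning None for the "never" case lets the stats page show something like a dash instead of a misleading number.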

Everyone's encouraged to share their ideas, as I'm open-minded. Help us make our project not only the most powerful, but the most FUN distributed project yet!

Firebirth
12-16-2002, 05:12 AM
I think the graphs should be more colorful... different colors for each IP address returning results in the personal graphs; one color for each team member in the team graphs, and one color for each sequence of numbers in the general graph. But I guess I mentioned that one before! Along with publishing which numbers each member has processed, and which are pending, as raw data.

Mevs Tarje

shifted
12-16-2002, 05:40 AM
If I had only one request, it would be to replace cEMs/sec with something that was equal for all values of n and k. I don't know what would be feasible, however (just something better than processing time, or I'll fire up all my old 486s lol).

eatmadustch
12-16-2002, 11:05 AM
What would be nice is to get awards (as in seti@home and theneoproject). There you get awards once you've completed a certain amount of work units. Here you could get awards for completing a certain amount of proth tests or cEMs. What would also be nice is to be able to see from the personal stats if you've found a prime or not, and if so which one, when, how long it took to compute ... and so on.
Another idea is to see how many and what kind of processors a user has (TheNeoProject also has this). This would require a new version of sob, I guess!
But I agree with you ... sob has the best stats by far!

Troodon
12-16-2002, 11:59 AM
Originally posted by eatmadustch
What would be nice is to get awards (as in seti@home and theneoproject). There you get awards once you've completed a certain amount of work units. Here you could get awards for completing a certain amount of proth tests or cEMs.

I think that will only bring more cheating. The real awards are the primes.

kugano
12-16-2002, 12:14 PM
A few things:

1) I've already implemented one of the ideas I mentioned in my first post. A breakdown of work, rate, and so on by country is now available at this URL:

http://www.seventeenorbust.com/stats/byCountry.mhtml

2) Regarding cEMs, they are *definitely* going to be changed to a more suitable unit. This was actually on our list of things to do before putting the new system online, but we had to accelerate our plans due to being slashdotted and so on.

3) About prizes: we can't do it. That is to say, unless we are able to find a sponsor willing to support prizes, which is possible. Louie and I have no financial backing whatsoever for this project; we don't get paid; it's purely something we've done in our own time as a pet project. This is fine, because we really enjoy it and never wanted to get paid. But it also means that we're too poor to have much to throw around in the way of glitzy prizes. ;)

I'll get back to you on the other ideas that have been posted so far. I like the discussion, and I'm glad everybody's sharing their thoughts. Keep it up!

shifted
12-16-2002, 12:18 PM
Originally posted by kugano
3) About prizes: we can't do it. That is to say, unless we are able to find a sponsor willing to support prizes, which is possible. Louie and I have no financial backing whatsoever for this project; we don't get paid; it's purely something we've done in our own time as a pet project. This is fine, because we really enjoy it and never wanted to get paid. But it also means that we're too poor to have much to throw around in the way of glitzy prizes. ;)

I, for one, am totally against prizes. If I had the chance at financial gain by running the client, I'd have to stop running the client on several processors. Personally, knowing that the project is advancing mathematical research is good enough motivation for me.

shifted
12-16-2002, 12:21 PM
Originally posted by kugano
1) I've already implemented one of the ideas I mentioned in my first post. A breakdown of work, rate, and so on by country is now available at this URL:

http://www.seventeenorbust.com/stats/byCountry.mhtml

Am I really 85% of Canada? Wow.

dfamily
12-16-2002, 01:12 PM
from Paulie at ars...

"Would you add a suggestion over there for me: Total test to run until project is over (maybe include how many cEMs needed to clear everything, then a compare to current rate to get the projected completed date), even assuming that there are no primes to be had in the remaining groups. Then adjust as each K is cleared."

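A rough sketch of the projection Paulie describes, assuming the remaining work (in cEMs) and the current project rate (in cEMs/sec) are both known and the rate stays constant. The function name and inputs are hypothetical.

```python
from datetime import datetime, timedelta

def projected_completion(remaining_cems, rate_cems_per_sec, now):
    """Estimated finish date if the current rate holds; None if the rate is zero."""
    if rate_cems_per_sec <= 0:
        return None
    return now + timedelta(seconds=remaining_cems / rate_cems_per_sec)
```

Each time a k is cleared, the remaining total would shrink and the projection could simply be recomputed.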
dfamily
12-16-2002, 01:30 PM
Talking about prizes...

I think he was speaking in terms of a paper certificate similar to SETI...something you can print and frame...

Although a lot of people may think it a silly idea, I wonder...I'm looking at an office wall that could have a nicely done certificate for SETI, ECCP-109, DPad, ECC2 neatly framed in a classic thin black...
All the docs and lawyers can have all their papers hung with care (has anyone actually got close enough to see what they really are?)...why can't we?

A very long time ago I had this card from somewhere in Russia that I was really very proud of...were they called SQL cards? (it was a long time ago and I don't remember so well) People listened to a shortwave radio and collected them...

Personally, I'm tickled pink to think that I'm helping find some of the largest prime numbers known to mankind, and having a framed piece of paper that said so hanging on the wall would actually be kinda neat...hey, you guys have put up the classiest looking DC web site ever, I'd love to see you design a certificate...

eatmadustch
12-16-2002, 01:31 PM
Originally posted by kugano


About prizes: we can't do it. That is to say, unless we are able to find a sponsor willing to support prizes, which is possible. Louie and I have no financial backing whatsoever for this project; we don't get paid; it's purely something we've done in our own time as a pet project. This is fine, because we really enjoy it and never wanted to get paid. But it also means that we're too poor to have much to throw around in the way of glitzy prizes. ;)


I never said I wanted prizes ... I have enough motivation thinking I could be in all the math books on prime numbers :D

I said I wanted "awards" and I guess I didn't quite say that so everyone understood. They are called Certificates of Appreciation in TheNeoProject (go to the stats page then on the leader, then you'll see that he has several Certificates of Appreciation with lots of nice fractals). In Seti@home they are called Workunit Certificates. You can see what they are at the bottom of http://setiathome.ssl.berkeley.edu/fcgi-bin/fcgi?email=lmester@access.k12.wv.us&cmd=user_stats_new (one of the leaders)

Mystwalker
12-16-2002, 01:57 PM
How accurate is the reverse lookup when it comes to identifying the country? Are there databases which assign the proper countries to certain nets? Or is there additional information provided by the lookup?

Another option would be that everyone inserts this info in his preferences - maybe with an option whether or not they want this shown on the website.
There we have the advantage that the country the person works for is counted, not the country of the subnet the PC is in - usually this should be the same, but there are some exceptions (was it smh? ;) )...

CPU type / OS distribution would be the next important item on the list for me... :thumbs:

shifted
12-16-2002, 02:04 PM
Originally posted by Mystwalker
How accurate is the reverse lookup when it comes to identifying the country?

From the new page
Beware: these statistics are only estimated. They are based on the two-letter country code of the IP address from where the work was submitted. Since some IP addresses don't resolve properly to hostnames, and some hostnames' country codes are not really the country where the IP is located, there is inherently some error in these numbers. Also, it should be noted that non-country-code domains like .com, .net, .edu and so on are included in the United States' totals, since most US domains end in these instead of .us.


Originally posted by Mystwalker
Are there databases which assign the proper countries to certain nets?

Yes. Freely available, I'm not so sure.


Or is there additional information provided by the lookup?

Maybe something can be built with ARIN whois data.

kugano
12-16-2002, 02:24 PM
When you do a reverse DNS lookup on an IP address, you get a hostname like "foo-bar-27.wanadoo.fr". The by-country rank script extracts the top level domain (TLD) "fr" from the address, consults an internal lookup table of all the country codes, and tallies that address up to the totals for France.

This works reasonably well for every country but the United States. Even though there is a "us" TLD, its use is mostly restricted to government or educational use. The script just assumes that "com", "edu", "net", "org", "gov", "mil" and anything else that isn't a country code is probably from the United States. I suspect the US's totals are probably artificially high because of this, but probably not by too much (certainly not enough for it to drop from the #1 spot).
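
A minimal sketch of the TLD approach described above, assuming the hostname has already been obtained from a reverse DNS lookup. The lookup table here is a tiny illustrative subset and the function name is hypothetical; this is not the actual by-country rank script.

```python
# Illustrative subset of a country-code table (assumption: the real table is much larger).
COUNTRY_CODES = {
    "fr": "France",
    "de": "Germany",
    "ca": "Canada",
    "uk": "United Kingdom",
}

def country_from_hostname(hostname):
    """Map a hostname to a country via its top-level domain.

    Anything that is not a known country code (com, net, edu, ...) is
    counted as the United States, mirroring the rule described in the post.
    """
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    return COUNTRY_CODES.get(tld, "United States")
```

For example, country_from_hostname("foo-bar-27.wanadoo.fr") gives "France", while any .com host falls through to the United States, which is exactly where the over-counting comes from.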

There probably is a better way to get more accurate country information from ARIN, RIPE and so on, but it would probably be a lot slower and a LOT more programming. If anybody knows of easier ways to do this, I'm all ears. ;)

And, to Mystwalker, yep: OS/machine-type breakdowns are now at the top of the list for new stats to implement. :)

kugano
12-16-2002, 02:32 PM
Regarding 'prizes': You're right! I completely misread your original post. My apologies! I do like the idea of some sort of 'award' for prime discoveries. A serious possibility that Louie and I actually considered, but never actually did, was to send the winning participants oversized posters with the SB logo and the prime's decimal expansion on the poster.

I made such a poster (measuring 3 feet tall by just over 6 feet wide, currently hanging along a wall in my bedroom), and I must say it's quite impressive.

A hugely-scaled-down image of this poster is here:

http://maximus.cvm.uiuc.edu/composite.gif

(Scroll to the right to see the logo and descriptive text. This image is something like 3% the scale of the actual image used to plot the poster at 300 dpi.)

The reason we never went through with this is that plotter time for these posters isn't easy to come by. I have to suck up to my boss at the University so that she won't mind the massive ink drain on the plotters. But, if we can come up with some arrangement to get the posters more easily, or get someplace like Kinko's to "sponsor the project" by giving us discounts on plots, ... well, we'll have to see. ;)

eatmadustch
12-16-2002, 02:47 PM
wow, I would love one of those posters ... gotta find more machines to run sob on :)

cjohnsto
12-16-2002, 04:24 PM
Use subnetting to find it out. Each country has its own unique range of IP addresses. Get the IP of the submitter and check what range it is from. Using DNS is quite unreliable.
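
A sketch of the range-based lookup suggested here, assuming a table of address blocks and their countries is available from somewhere. The two blocks below are reserved documentation ranges and the codes "XX" and "YY" are placeholders, not real allocations.

```python
import bisect
import ipaddress

# Hypothetical table: (first address, last address, country code), sorted by start.
RANGES = sorted(
    (int(net.network_address), int(net.broadcast_address), code)
    for net, code in (
        (ipaddress.ip_network("192.0.2.0/24"), "XX"),
        (ipaddress.ip_network("198.51.100.0/24"), "YY"),
    )
)
STARTS = [start for start, _end, _code in RANGES]

def country_for_ip(ip):
    """Return the country code of the block containing ip, or None if unknown."""
    value = int(ipaddress.ip_address(ip))
    i = bisect.bisect_right(STARTS, value) - 1  # last block starting at or before ip
    if i >= 0 and RANGES[i][0] <= value <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

The binary search keeps each lookup cheap even with a large table, but it assumes the blocks do not overlap.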

smh
12-16-2002, 05:35 PM
Why not make it possible to enter your country in the preferences? If nothing is in there you could always do a reverse lookup

Chinasaur
12-16-2002, 07:39 PM
Kugano,

If you have a PayPal account I'll give a dollar or two for the posters. If a few hundred gave a dollar or so...it would more than pay for a nice Kinko poster.

shifted
12-16-2002, 07:42 PM
Originally posted by Chinasaur
Kugano,

If you have a PayPal account I'll give a dollar or two for the posters. If a few hundred gave a dollar or so...it would more than pay for a nice Kinko poster.

Yes, I can forward a couple bucks also, to a paypal account.

dudlio
12-17-2002, 12:38 AM
I miss the larger graphs. In fact, anything to do with graphs is cool. If they're generated on the fly, it would be nice if the viewer could enter a time range.

Also, it seems there are two ways to configure your stats, either run all machines under a single username or make yourself a team and run each machine as its own user. Really we should go 3 levels deep (team-->user-->machine).

This makes the data model more complex, but you could put the machine stats on the user page.

smh
12-17-2002, 05:21 AM
It would be great if we could see which numbers are in progress, which are done, at what time, etc.

Or give us the old graphs per K back, at least that gave an idea how many exponents below a certain breakpoint are still pending.

mordrid52
12-18-2002, 04:15 PM
I'd like to see some sort of change to the overall rate stat. I dabbled in the project in its early days then completely stopped running it for quite some time. It's only in the last couple of months that I've started seriously running it. But now I'm ranked in the 500s for overall production (and will probably stay there unless I find a supercomputer to run the client on :D) because of all those months of no activity.

dmbrubac
12-18-2002, 04:33 PM
Originally posted by shifted
Am I really 85% of Canada? Wow.

Well I appear to be the other 15%, which is a bit disappointing. Are there really only 2 Canadians working on this?

BTW I would like to see trend indicators on the per country stats, just like on the individual stats.

Since shifted and I are wondering about per country participation, could we also treat countries like teams and see the other members/citizens and their production?

alpha
12-18-2002, 04:57 PM
Originally posted by smh
Why not make it possible to enter your country in the preferences?

I second this. I always submit work from a .com host but am located in the UK.

shifted
12-18-2002, 07:07 PM
Originally posted by alpha
I second this. I always submit work from a .com host but am located in the UK.

I third this :)

kugano
12-18-2002, 07:22 PM
While I admit the DNS solution isn't perfect, I'm not sure I'm ready to allow users to choose their country in their preferences either. The reason is that, while certainly no one here falls into this category, there are a lot of dishonest people out there who would choose a country other than their actual country. The by-country page would then become a popularity contest, and that's not what they're meant to be at all.

I'm going to look into the idea someone suggested earlier on the forum about using IANA/ARIN/RIPE/etc. data to determine country ... someone, somewhere, has to have come up with a database mapping IP ranges to countries.

Vato
12-18-2002, 07:51 PM
Any RPSL compliant inetnum object should have a country field containing the 2-letter ISO-3166 country code - but bear in mind that the object data may have been edited by error-prone humans (like me). Unfortunately, there's no guarantee that ARIN data will be as standard as RADB/RIPE/APNIC/CW etc

A query to whois with the flag to search any database will cover most bases, but would generate a LOT of lookups - you should then either cache answers, or probably run your own IRRd with live updates from RADB/RIPE etc. This may be more involvement than you're looking for.... :(

Using the RIPE whois client, the following pipeline would give you approximately what you're after:

whois -a -r -T inetnum <ipaddress> | egrep '^country:'

Further pointers available upon request.
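
For anyone automating this, a small sketch of the filtering step that the egrep in the pipeline above performs, written as a function over whois output that has already been captured as text. Running the whois client itself is left out, and the sample text in the comment is made up.

```python
def country_lines(whois_text):
    """Collect the values of all 'country:' attributes in whois output.

    Like egrep '^country:', this only matches lines that start with the
    lowercase attribute name. Values are upper-cased for later comparison.
    """
    codes = []
    for line in whois_text.splitlines():
        if line.startswith("country:"):
            codes.append(line.split(":", 1)[1].strip().upper())
    return codes

# Hypothetical input:
#   inetnum:  192.0.2.0 - 192.0.2.255
#   country:  nl
# country_lines(...) would then return ["NL"]
```

Caching the result per address block, as suggested above, would keep the number of whois queries down.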

kugano
12-18-2002, 11:05 PM
Excellent! I'm looking into automating this now. Is there a way, as far as you know, to restrict the whois results to authoritative entries only? Non-authoritative sources return (incorrect) country codes, and at the moment I can't think of any way to algorithmically determine which of the several entries is authoritative (a human can do it by reading the comments, but Perl ... ?)

outlndersob
12-19-2002, 01:38 AM
I would like to see the individual graphs being correct.

I have accessed the site multiple times and see my production plummeting on the graph. I then run to my machines and find out nothing is wrong and they are all running full bore.

Why do the graphs show incorrect info??


I'd also like to see the individual pages updated more than once a day.

Vato
12-19-2002, 01:47 AM
To choose between several inetnums, pick the most specific one, i.e. the one with the longest prefix: a /29 (8-address block) wins over a /24 (256-address block, a class C network). In simplistic terms, an easy way to compare them is to compute each block's size: subtract the first IP address of the range from the last IP address, treating both as 32-bit unsigned values (use inet_aton for this), and add 1 to the result. The result should be a power of 2, with each doubling or halving corresponding to a one-bit change in the prefix length. Any result which is not a power of 2 indicates an invalid entry. In these terms, you would choose the inetnum with the smallest result as your preferred value.
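
The selection rule above can be sketched as follows. This is only an illustration of the simplistic algorithm as described, with hypothetical function names and placeholder country codes; Python's ipaddress module stands in for inet_aton.

```python
import ipaddress

def block_size(first_ip, last_ip):
    """Number of addresses in the range, or None if it is not a power of 2."""
    first = int(ipaddress.IPv4Address(first_ip))
    last = int(ipaddress.IPv4Address(last_ip))
    size = last - first + 1
    if size <= 0 or size & (size - 1):  # a power of 2 has exactly one bit set
        return None
    return size

def most_specific(inetnums):
    """Pick the country of the smallest valid block.

    inetnums is an iterable of (first_ip, last_ip, country) tuples.
    """
    sized = [(block_size(first, last), country) for first, last, country in inetnums]
    valid = [(size, country) for size, country in sized if size is not None]
    if not valid:
        return None
    return min(valid, key=lambda item: item[0])[1]
```

With a /24 tagged "XX" and a /29 inside it tagged "YY", most_specific returns "YY". Note that this checks only the block size, not whether the range is aligned on a prefix boundary.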

If you search on google for iso3166, you'll find things like http://nl.ijs.si/gnusl/cee/std/ISO_3166.html. The column of 2 letter codes is the one we care about. With one exception that I know of, if a value from the whois lookup is not a case-insensitive match for an item from this column - it is invalid and should be ignored. The one exception is UK, which is in use as a country code and as a TLD, although the 'correct' value should have been GB. Treat UK as a synonym for GB and you'll get the right answers.
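
A short sketch of that validation step, assuming the two-letter codes have been loaded into a set. Only a handful of real ISO 3166 codes are listed here for illustration, and the function name is hypothetical.

```python
# Illustrative subset of the ISO 3166 two-letter codes (the full list is much longer).
ISO_3166_ALPHA2 = {"GB", "FR", "NL", "TR", "DK", "NO", "CA", "US"}

def normalize_country(code):
    """Return a valid upper-case country code, or None if the value should be ignored."""
    code = code.strip().upper()
    if code == "UK":  # in use as a country code and TLD, but the ISO code is GB
        code = "GB"
    return code if code in ISO_3166_ALPHA2 else None
```

So "uk" and "gb" both end up counted under GB, and anything unrecognised is dropped rather than guessed at.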

Hope this is useful.

smh
12-23-2002, 04:02 PM
There seem to be some inconsistencies between the upper bound number that has been tested and the lowest number under test for a given K.

Is this because some tests were manually added?

kugano
12-23-2002, 11:51 PM
It's because of dropped (or, actually, "late") tests: there may still be tests out for lower n values which have already been returned by other users. So the upper bound climbs while the "min n under test" remains low, since the test is still out to some user (even though another user already reported it).

Supp
12-25-2002, 12:51 PM
Originally posted by kugano
While I admit the DNS solution isn't perfect, I'm not sure I'm ready to allow users to choose their country in their preferences either. The reason is that, while certainly no one here falls into this category, there are a lot of dishonest people out there who would choose a country other than their actual country. The by-country page would then become a popularity contest, and that's not what they're meant to be at all....

Well, I think that most people are proud of the country they live in & crunch for, so a "popularity contest" is very unlikely and you wouldn't have to dig through all that WHOIS stuff.

Just an opinion, though.

eatmadustch
12-25-2002, 01:03 PM
I don't know. I think a lot of people wouldn't take it seriously and enter stuff like Andorra, Liechtenstein, Vatican City (all extremely small countries) and other "silly stuff" like that! Or, a lot of "anti-Americans" would enjoy entering Iraq and Afghanistan!

shifted
12-25-2002, 08:38 PM
Originally posted by eatmadustch
I don't know. I think a lot of people wouldn't take it seriously and enter stuff like Andorra, Liechtenstein, Vatican City (all extremely small countries) and other "silly stuff" like that! Or, a lot of "anti-Americans" would enjoy entering Iraq and Afghanistan!

Very good point. I'd stick with the whois stuff as it's reasonably accurate now and already implemented.

Cowering
01-01-2003, 10:51 PM
I currently have this in the stats display for my team:

40 DeathAxe 1000.00 M cEMs 8.32 K cEMs/sec 7.03 K cEMs/sec

1000.00 M is 1.00 G of course :)

Just thought I'd mention it
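
One plausible cause (an assumption, since the stats code isn't shown) is that the unit prefix is picked before the value is rounded to two decimals, so something just under 1000 M rounds up to "1000.00 M". A sketch of a formatter that checks the rounded value instead, with a hypothetical function name:

```python
PREFIXES = ["", "K", "M", "G", "T", "P"]

def format_cems(value):
    """Format a cEM count with a metric prefix, never showing 1000.00 of any unit."""
    scaled = float(value)
    idx = 0
    # Compare the value as it will be displayed (2 decimals), not the raw value.
    while idx < len(PREFIXES) - 1 and round(scaled, 2) >= 1000:
        scaled /= 1000
        idx += 1
    prefix = PREFIXES[idx]
    return f"{scaled:.2f} {prefix + ' ' if prefix else ''}cEMs"
```

With this, a total of 999,999,000 cEMs is shown as "1.00 G cEMs" rather than "1000.00 M cEMs".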

Cowering
01-02-2003, 10:22 AM
I was happily surprised that TeamRetro jumped from #5 to #2 on 'last day's rate' (and got the blue up arrow)... I was even more surprised when I looked at today's update and found the nice blue arrow still there, but we are still at #2 :) Probably something simple to fix, as I'm sure TeamPrimeRib would not want to get caught TOOO soon

EggmanEEA
01-04-2003, 08:31 AM
Originally posted by dmbrubac
Well I appear to be the other 15%, which is a bit disappointing. Are there really only 2 Canadians working on this?



Between shovelling snow, sharpening my skates, and eating back bacon, I'm also a Canadian and judging by the current stats, I believe I will have you and shifted begging for mercy in a few weeks time, I just need to catch up... ;)

:notworthy

EggmanEEA

dmbrubac
01-04-2003, 10:38 AM
That post is from before the country stats were changed. I believe Louie had lumped all the .com, .org, etc into the US stats. They took a big shift a couple weeks ago and now it is clear I am NOT one of only two Canucks.

I feel much better now.....

MikeH
01-10-2003, 11:58 AM
Originally posted by alpha
I second this. I always submit work from a .com host but am located in the UK.

I also agree it would be so much better to use the user profile information to obtain country stats. Interestingly this info is already in the user public profile, problem is it's a freeflow text field rather than a selection from a list.

I also live in the UK but will always be resolved as .com (or un-resolved depending upon where I am). Never .uk.

Mystwalker
01-10-2003, 12:25 PM
As it says on the relevant page:

"They are based on the two-letter country code of the IANA IP address registry of the address from which the work was submitted and could be slightly off."

Some countries indeed seem to do more than they are credited on that page, but I'm not sure if it's better to let the members decide, as everybody would have to do it.

Though an advantage would be that you can work abroad and the country you're from still gets credited.

Maybe a solution already mentioned is best:
Take IANA, unless there's a country entered in the user profile.

Scarblac
01-10-2003, 02:05 PM
I posted a reply here saying I didn't like .com being the US, but I see now that this has been fixed already, ignore.

jjjjL
01-11-2003, 07:31 AM
The IANA IP lookup is far more accurate than whois.

For example, here is MikeH:

62.190.xxx.xx uk (ip address censored to protect the innocent)


I think this is better than asking users which country they are from. People will just list phoney countries like "Togo" and exotic ones when given a list of all of them. I've run other DC projects and I'VE DONE THAT. I think all of us have that hotmail account somewhere where we're "Julio Jamal" from the Easter Islands.

And some users submit from multiple countries (such as Payam, from Iran and Pakistan). So in some cases, even if a "real" country is listed in his profile, it will still make the stats less correct.


-Louie

MikeH
01-11-2003, 08:33 AM
Being "the innocent" here, can we just clarify which method is being used to generate the stats right now?

This 62.190.xxx.xx address with a simple hostname resolution will result in a .com. So right now is this contributing to the US or UK stats?

And thinking about it, I do agree that allowing the user to select is not so good. If only because users select one that no one else has, then, from their perspective it becomes them vs the world.

So yes, please go with (or stick with) IANA IP lookup if it gives good results. Which brings me back to the question - which method is being used right now?

Mystwalker
01-11-2003, 10:28 AM
To quote myself: ;)


As it says on the relevant page:
"They are based on the two-letter country code of the IANA IP address registry of the address from which the work was submitted and could be slightly off."

Addendum: I mean this page (http://www.seventeenorbust.com/stats/byCountry.mhtml). The note is at the bottom.

MikeH
01-11-2003, 10:37 AM
My humble apologies. I had read the "Take IANA, unless there's a country entered in the user profile" argument as if it was still being proposed. Missed the bit at the bottom of the country stats page.

All clear now. Sorry.

Mystwalker
01-11-2003, 10:40 AM
No apologies.
The bottom of a long page is rarely looked at. That's why it's the perfect place for the most unpleasant parts of the page. :D

Nuri
01-12-2003, 05:24 PM
Well, it seems that this IANA IP Lookup thing didn't work well for me. :(

I think I am the only participant from Turkey and country stats were almost perfectly correlated with my user stats until a week or so ago, and Turkey's 24-hr rate suddenly dropped to 0 and never came back. :eek:

These are my user stats. (http://www.seventeenorbust.com/stats/users/user.mhtml?userID=1577) I'm showing the link because my user id is different from Nuri.

I guess this is due to the change from whois to IANA. I guess this is bad luck on my side.

I agree that allowing the user to select is not good and an IP lookup method should give better results. So, since you agree that IANA lookup is far better, it seems there is nothing to do unless Louie manually specifies my IP address to belong to Turkey.

Anyway, just wanted to share that with you.

Scarblac
01-13-2003, 07:19 AM
Another data point to show that the current country stats don't work: I started a bit over a month ago. The first time I saw the country stats, the Netherlands were in #5, just like they're now.

About 10 days ago, I started asking around, and some friends joined my team, PINO. They're all from the Netherlands. Together we do 1M+ cEM/s daily rate. The Netherlands are still at #5, and in fact the whole country's daily rate is now below our team's every day! Even if there are no Dutch here apart from us (and there must be quite a few more), the stats are still too low.

Actually, the system's daily rate is at about 95 McEM/s, whereas all the daily rates in the country stats added together gives about 49 McEM/s...

I'd much rather see a few fake Afghanistan entries than the current situation, I don't think a significant amount of people would fill in another country.

shauge
02-11-2003, 01:17 PM
I second Scarblac.

I submit from the one major provider in Norway at work and from another big provider at home and none are registered to my country.

We are participating in this project for a good reason. Some friendly contest between countries would just be fun.

hc_grove
02-12-2003, 11:15 AM
Originally posted by kugano
A few things:

1) I've already implemented one of the ideas I mentioned in my first post. A breakdown of work, rate, and so on by country is now available at this URL:

http://www.seventeenorbust.com/stats/byCountry.mhtml

And those suck!

Apparently I've done 118% of the work done in Denmark.

(And our top producer biff has done 1039% of the work done in Sweden? At least his profile says he's Swedish.)

The only thing I can see that would explain this is that most of the computers I'm using are behind a firewall and have private IP addresses (10.2.1.*). But the firewall has a public IP (which I'll gladly give you in private mail) which has a nice reverse DNS lookup ending in .dk.
:bang:

A feature I would like is a number telling me how much of the team's (and country's, if those stats can be made meaningful) work I've done.

.Henrik

RangerX
02-12-2003, 09:03 PM
I'm sort of late in posting this (I haven't been keeping up with forums since I stopped sieving) but I'd pay a good 25-30 $US (more than that and I'd have to start negotiating with my money-saving side) for one of those posters! I've got a nice blank spot on my wall that I think it would fit into quite nicely...

MikeH
02-13-2003, 03:15 AM
If I understand correctly how SB country determination works, if you go here (http://www.maxmind.com/app/lookup) and enter your IP addresses, this should indicate the country that will be used for the stats.

For people like Nuri, Scarblac and shauge, if this shows your country correctly, but stats are wrong then there is an SB problem that needs sorting.

Scarblac
02-13-2003, 03:43 AM
Originally posted by MikeH
If I understand correctly how SB country determination works, if you go here (http://www.maxmind.com/app/lookup) and enter your IP addresses, this should indicate the country that will be used for the stats.

For people like Nuri, Scarblac and shauge, if this shows your country correctly, but stats are wrong then there is an SB problem that needs sorting.

This works correctly for me. It also works for the other people on my team (PINO) that I've tried (all from the Netherlands - and almost all either from the University of Groningen, or the @Home cable provider, and those IPs all work correctly there). Still, my team did 1.73McEM/s in the last 24 hours, almost double the total production of the country the last day (943.69 KcEM/s). Last time I already noticed that the stats on the country page don't add up to the total project stats at all.

I have noticed other small problems with the stats, sometimes they just don't add up. Let me look...

Say, team stats. Total production of PINO is now 5.04 TcEM in 67.7 days, or an average of 5040000000 KcEM/(67.7*24*3600 s) ~= 862 KcEM/s, close to what the overall rate table says (863.33).

For team d'family, it says 4.48T in 62.7 days (average about 827), but the overall rate column says 679.04 - not close.
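
The arithmetic above can be reproduced with a few lines. The totals and day counts are the ones quoted in this post; the function name is just for illustration.

```python
def average_rate(total_cems, days):
    """Average production rate in cEM/s over the given number of days."""
    return total_cems / (days * 24 * 3600)

pino = average_rate(5.04e12, 67.7)     # 5.04 TcEM in 67.7 days
dfamily = average_rate(4.48e12, 62.7)  # 4.48 TcEM in 62.7 days

print(f"PINO:     {pino / 1e3:.0f} KcEM/s")     # about 862, table says 863.33
print(f"d'family: {dfamily / 1e3:.0f} KcEM/s")  # about 827, table says 679.04
```

The small PINO difference could be down to rounding of the displayed totals, but the d'family gap is too large for that.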

I think I've noticed other things but I can't think of them at the moment.

Nuri
02-13-2003, 11:07 AM
It's correct for me too. (195.174.xxx.xxx is Turkey)

I don't really check for country stats with much frequency since I discovered that it was wrong. So, if anybody checks it frequently, can you please confirm if it is updating itself at all? AFAIK, the 37 countries figure did not change since the country stats were first implemented. Is it really likely that even just one person from any one of the remaining ~170 countries did not join the project within that period?


As of now, total SB production is 593.34T cEMs. The total of operating systems is something like 585T cEMs (99% of total), which is pretty close, and acceptable. On the other hand, the total of 37 countries is 326.12T cEMs (55% of total). So, where did the remaining 45% come from? :confused:
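
A quick check of the two percentages, using only the totals quoted above (in T cEMs). The function name is hypothetical.

```python
def share(part, total):
    """Percentage of the project total covered by a breakdown."""
    return 100.0 * part / total

os_share = share(585.0, 593.34)        # operating-system breakdown
country_share = share(326.12, 593.34)  # by-country breakdown

print(f"OS: {os_share:.0f}%, countries: {country_share:.0f}%")
```

So roughly 45% of the project's work is not attributed to any country.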


Maybe we should contact SETI researchers. It might be a kind of extraterrestrial intelligence trying to contact us, and perhaps we've found ETI before everybody else, as a byproduct of our project. :D :thumbs:


Ok, seriously, I personally prefer not to have any country stats at all if even the total is 45% off. In my view, there is really no need to mention individual countries at all if the total is that misleading. As a last note, I really don't care much about the country stats. What troubles me more is whether the obvious error in country stats is harming the overall confidence in the project itself.

By the way, Remaining tests n < 3000000 and Remaining tests n > 3000000 stats on the overall stats page is cool.

Regards,

Nuri

hc_grove
02-13-2003, 12:17 PM
I don't really check for country stats with much frequency since I discovered that it was wrong. So, if anybody checks it frequently, can you please confirm if it is updating itself at all?


It does change.



As of now, total SB production is 593.34T cEMs. The total of operating systems is something like 585T cEMs (99% of total), which is pretty close, and acceptable. On the other hand, the total of 37 countries is 326.12T cEMs (55% of total). So, where did the remaining 45% come from? :confused:


It might be caused by failing reverse DNS-lookups. There are two things that might cause this:

1. Unfortunately, not all IPs in use have proper reverse DNS. :swear:

2. As I wrote yesterday, my work appears not to be counted (at least not for the country), probably because the computers I use have private IP addresses and the stats are based on IPs reported by the client. Try performing a reverse DNS lookup for 10.2.1.129 (the IP of the computer I'm using right now), 192.168.1.42 (my desktop computer at home) or 172.28.10.67 (my laptop when I'm at home).
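
For reference, the three addresses mentioned above all fall in the private (RFC 1918) ranges, which is why no public reverse lookup or registry lookup can place them in a country. A quick check with Python's standard `ipaddress` module (the helper name is just for illustration):

```python
import ipaddress


def is_private(addr: str) -> bool:
    """True if the address is in a private/reserved range that is not routable on the public internet."""
    return ipaddress.ip_address(addr).is_private


# The three addresses quoted in the post above.
for addr in ["10.2.1.129", "192.168.1.42", "172.28.10.67"]:
    print(addr, "private" if is_private(addr) else "public")
```

Whether the stats actually see these addresses depends on whether the server uses the address reported by the client or the source address of the connection.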



Ok, seriously, I personally prefer not to have any country stats at all if even the total is 45% off.


I agree.

.Henrik

shauge
02-13-2003, 01:08 PM
I just looked up my home IP address, and it resolved to the correct country. I have steadily produced more than 300K/s at home, which is more than my country's total rate.

I will check my work ip-address tomorrow.

It looks as if this issue may be possible to fix, then. Good.

MikeH
02-13-2003, 01:32 PM
hc_grove wrote
2. As I wrote yesterday, my work appears not to be counted (at least not for the country), probably because the computers I use have private IP addresses and the stats are based on IPs reported by the client. Try performing a reverse DNS lookup for 10.2.1.129 (the IP of the computer I'm using right now), 192.168.1.42 (my desktop computer at home) or 172.28.10.67 (my laptop when I'm at home).
I think these addresses you quote are just local LAN addresses; they are not the address by which you'll be known to SB. If you follow this (http://www.maxmind.com/app/lookup) link, you'll see your external IP address and the country (even if it can't be resolved).

hc_grove
02-13-2003, 04:33 PM
Originally posted by MikeH
I think these addresses you quote are just local LAN addresses; they are not the address by which you'll be known to SB.


Why do you think I have been calling them private? I know they are local to the LANs. That doesn't change the fact that they are probably the only addresses the client knows.

And since there is perfectly fine reverse DNS for the public addresses I use, relying on IPs from the client is the most probable explanation for my work not being included in the numbers for Denmark.


If you follow this (http://www.maxmind.com/app/lookup) link, you'll see your external IP address and the country (even if it can't be resolved).

Besides being a graduate student in mathematics, I'm also working as a systems & network administrator, so I know how to find my public IP address.

.Henrik

MikeH
02-13-2003, 04:50 PM
hc_grove, no offence intended.

What I was really trying to say is that it's unlikely that SB are recording your private IP address. In fact I know they're not, from a previous posting with regard to my IP address. As such, this should not be the cause of the reports of country stats being wrong.

hc_grove
02-13-2003, 05:02 PM
Originally posted by MikeH
hc_grove, no offence intended.


I didn't think so, so none taken.



What I was really trying to say is that it's unlikely that SB are recording your private IP address. In fact I know they're not, from a previous posting with regard to my IP address. As such, this should not be the cause of the reports of country stats being wrong.

Then I'll be looking forward to an explanation from someone who knows the system.

.Henrik

MDFaunce
02-13-2003, 10:11 PM
They aren't using reverse DNS, they are looking at the country of the owner of the IP address. This is not necessarily the same as the person who has been assigned the IP address.

From my memory, SoB started off doing rDNS, but there isn't a good way to get country data from that (where is .com? or .net? or .edu?). So, SoB switched to using the IANA IP address assignments.

The owner of an IP can be researched by doing a WHOIS against the following websites:

ARIN (www.arin.net) does the Americas & some of Africa.
APNIC (www.apnic.net) does the Asia/Pacific areas.
LACNIC (lacnic.net/en/index.html) does Latin America & some of the Caribbean.
RIPE NCC (www.ripe.net) does Europe, the Middle East, Central Asia & some of Africa.

Some of these will also point you to sub-registrars if they've handed off blocks to others. Some ISPs will register IP assignments with the appropriate registrar but, in my experience, many don't. In that case the IP is shown as being where the ISP HQ is. I usually start with www.arin.net, but that's because I'm in the US.
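
As a rough illustration of the WHOIS approach described above: registry responses are plain text and often include a "country" field (capitalisation varies by registry), so extracting it can be sketched as below. The sample response is made up for illustration and uses a documentation-only address range; real output differs between registries and may not contain a country at all.

```python
import re
from typing import Optional

# Matches a line such as "Country:   US" or "country: DK", case-insensitively.
COUNTRY_RE = re.compile(r"^\s*country:\s*([A-Za-z]{2})\s*$", re.IGNORECASE | re.MULTILINE)


def country_from_whois(response_text: str) -> Optional[str]:
    """Return the two-letter country code from a WHOIS response, or None if absent."""
    match = COUNTRY_RE.search(response_text)
    return match.group(1).upper() if match else None


# Hypothetical response text, NOT real registry output.
SAMPLE_RESPONSE = """\
NetRange:   192.0.2.0 - 192.0.2.255
OrgName:    Example ISP
Country:    US
"""

print(country_from_whois(SAMPLE_RESPONSE))
print(country_from_whois("OrgName: Example ISP without a country line"))
```

Note that this returns the country of whoever registered the block, which, as described above, is not necessarily where the address is actually used.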

For example, my IP at home is 66.23.192.250 (Go ahead, blast away, I'm pretty sure I'm safe, but if you do find an obvious hole, be gentle :bang: ). If you do a whois at www.arin.net, you'll find that this IP is owned by Speed Factory (my ISP). It's "mine" in that it's been statically assigned to my router by my ISP, but my ISP owns it and hasn't taken the time to register it to me with ARIN (not that I really expected them to, or care if they do). In my case the ISP just happens to be in the same city, state and country as I am (their HQ is actually only about 2 miles from my house, one of the reasons I selected them). But that doesn't have to be the case. I know that my office's IP will be listed as being in NC and that the particular IP we have is listed as being assigned from them to another ISP who is listed as being in VA. My office is in GA. So, if SoB were to use the same routine they are using now, but extend it down to the state level, the units I completed at work would be credited to either NC or VA, but not GA.

In doing research on tracking down spammers, I've run across many cases in Asia and Europe where the ISP is in one country but the IP is being used in a different country. I think this is more prevalent in Europe and Asia, where the ISPs are multinational.

As usual, this is only my understanding of the way things work, so take it with a grain of salt.

shauge
02-14-2003, 03:13 AM
I have now checked my IP addresses both at home and at work using both of the above methods, in all cases the ip address is linked with the correct country.

Maybe Henrik is right. Maybe it is the IP address that the client thinks it has that is used. Then everybody who is behind a router will not get the correct country, nor will anyone who has a combined DSL modem/router at home.

I think we deserve an explanation.

Scarblac
02-14-2003, 03:20 AM
Originally posted by shauge
I have now checked my IP addresses both at home and at work using both of the above methods, in all cases the ip address is linked with the correct country.

Maybe Henrik is right. Maybe it is the IP address that the client thinks it has that is used. Then everybody who is behind a router will not get the correct country, nor will anyone who has a combined DSL modem/router at home.

I think we deserve an explanation.

I don't think that's the reason; most people in my team are on our uni's network (129.125.*) or have a cable modem, and they all have a normal IP address. I suspect there's just some small bug in the stats code somewhere.

hc_grove
02-14-2003, 03:54 AM
Originally posted by MDFaunce
They aren't using reverse DNS, they are looking at the country of the owner of the IP address.


No. That's also correct for the public addresses I use.


.Henrik

eatmadustch
02-14-2003, 10:09 AM
A neat little tool (for Linux and Windows) is the cyberabuse whois (www.cyberabuse.org). I use it mainly for tracking down spammers. The neat thing about it is that it only shows relevant information and it searches for the WHOIS server itself. So it will show which country the spammer (or SoB user or whatever) is from, it will show the abuse e-mail (useful if it's a spammer) and it will show other relevant information too. For example, MDFaunce (see his post earlier with his IP) is in the US, and if he spams you, you've got to complain to darryl@speedfactory.net :)

hc_grove
02-25-2003, 05:32 AM
Originally posted by Nuri
I don't really check for country stats with much frequency since I discovered that it was wrong. So, if anybody checks it frequently, can you please confirm if it is updating itself at all? AFAIK, the 37 countries figure has not changed since the country stats were first implemented.


I just saw that the list now includes 38 countries. (But I don't know the list well enough to tell which country is the new one.) That shows that new IPs are in fact resolved.



As of now, total SB production is 593.34T cEMs.
The total of operating systems is something like 585T cEMs (99% of total), which is pretty close, and acceptable. On the other hand, the total of 37 countries is 326.12T cEMs (55% of total).


Now the total production is up to 736.388T cEMs, and the total of the 38 countries is around 360T cEMs. That's 48% of the total, so the stats have gotten even worse!

That means that only 28% of the work done in the last weeks has been attributed to a country.
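
The reasoning here is a difference of two snapshots: new work attributed to countries divided by new work overall. A sketch using the figures quoted in this thread follows; note that the current country total is only given as "around 360T", so the results are approximate and will shift with the exact, unrounded value.

```python
# Snapshot figures in TcEM, as quoted in the thread.
total_then, total_now = 593.34, 736.388
country_then = 326.12
country_now = 360.0  # "around 360T" in the post, so only approximate

# Share of all work currently attributed to a country.
overall_share = 100.0 * country_now / total_now

# Share of the work done between the two snapshots that was attributed to a country.
new_work = total_now - total_then
new_country_work = country_now - country_then
recent_share = 100.0 * new_country_work / new_work

print(f"Attributed overall:           {overall_share:.1f}%")
print(f"Attributed since last check:  {recent_share:.1f}%")
```

Either way, the attributed fraction of recent work comes out well below the overall fraction, which is the point being made above.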

Fix those country stats or remove them. In the present state they are worthless (at best).

.Henrik

FoBoT
02-25-2003, 10:55 AM
Originally posted by kugano
4) Take-over times for user and team rankings, i.e. an indicator of "how long it will be at your current rate to overtake the person ahead of you in the rankings."


this is something i really like in some other DC projects, it would be especially helpful for SoB due to the units the stats are measured in, i am not too smart and don't really understand what the stats mean, so if you made it plain "FoBoT will pass XYZ in 3 days and 4 hours" that would be super cool :cool:
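
A take-over time like this is just the gap in total work divided by the difference in rates. A minimal sketch, assuming totals and rates are given in consistent units (for example GcEM and GcEM per day); the function name and the sample numbers are made up for illustration:

```python
from typing import Optional


def days_to_overtake(my_total: float, my_rate: float,
                     their_total: float, their_rate: float) -> Optional[float]:
    """Days until 'my' total passes 'their' total at constant rates, or None if it never will."""
    gap = their_total - my_total
    if gap <= 0:
        return 0.0  # already ahead (or tied)
    closing_rate = my_rate - their_rate
    if closing_rate <= 0:
        return None  # not gaining on them at the current rates
    return gap / closing_rate


# Hypothetical numbers: 38 units behind, gaining 12 units per day.
t = days_to_overtake(my_total=100.0, my_rate=20.0, their_total=138.0, their_rate=8.0)
if t is None:
    print("FoBoT will not pass XYZ at the current rates")
else:
    days, hours = int(t), round((t - int(t)) * 24)
    print(f"FoBoT will pass XYZ in {days} days and {hours} hours")
```

The estimate is only as good as the rate figures, which is why a longer collection history gives more stable predictions.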

jjjjL
02-25-2003, 02:33 PM
IPs are pulled off the first packet in the login transmission by the server.

Countries are pulled from IANA listings only. If IANA has no country associated with a record, it can't be used.
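
In other words, the lookup is a match of the connecting address against a table of allocated blocks, and an address whose block has no country simply goes uncounted. A rough sketch of that idea, NOT the actual server code: the table below is invented and uses documentation-only address ranges with arbitrary country codes.

```python
import ipaddress
from typing import Optional

# Hypothetical allocation table: (address block, country code or None if the listing has no country).
ALLOCATIONS = [
    (ipaddress.ip_network("192.0.2.0/24"), "DK"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
    (ipaddress.ip_network("203.0.113.0/24"), None),  # block listed without a country
]


def country_for(addr: str) -> Optional[str]:
    """Return the country of the block containing addr, or None if unknown."""
    ip = ipaddress.ip_address(addr)
    for network, country in ALLOCATIONS:
        if ip in network:
            return country
    return None  # address not covered by any listed block


for addr in ["192.0.2.17", "203.0.113.5", "10.2.1.129"]:
    print(addr, country_for(addr) or "no country, not counted")
```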

-Louie

hc_grove
02-25-2003, 07:04 PM
Originally posted by jjjjL
IPs are pulled off the first packet in the login transmission by the server.

Countries are pulled from IANA listings only. If IANA has no country associated with a record, it can't be used.

-Louie

Would you mind explaining what you mean by IANA listings? IANA doesn't hold much data themselves, but leave it to the RIR's.

Of course, that would explain why 72% of the work being done at the moment isn't counted.

.Henrik

MAD-ness
02-27-2003, 02:18 AM
Fobot: the data has not been collected for long enough for truly accurate long-term predictions, but try this link:

http://marcc.no-ip.org/SoB/User.php?User_ID=360

MarcC from the Ars Technica DC teams has been working to port his ECC2 stats over to SoB.

Hopefully that will at least alleviate a bit of your stats problems, Fobot. :)

Mystwalker
02-27-2003, 04:18 AM
That looks really awesome! :thumbs:

Can't wait to see it being integrated. :)

shauge
02-27-2003, 03:15 PM
I really like these stats.

Impressive! :thumbs:

Cmarc
02-27-2003, 04:34 PM
Thanks for the kind comments.
I still have a couple of small problems at the moment:

-How is rank calculated for the official stats? I can't seem to get a match for all the users with no work.

- How do I get the total and average work values for team members? A person's work does not follow him from one team to another, but to accurately calculate totals I would need more history than I currently have. Is there another place I could find this information (without parsing all team pages every hour)?

Thanks,
Marc

hc_grove
02-27-2003, 05:26 PM
Originally posted by Cmarc
Thanks for the kind comments.
I still have a couple of small problems at the moment:

-How is rank calculated for the official stats? I can't seem to get a match for all the users with no work.


They seem to be alphabetically sorted.

It's really nice.

.Henrik

jjjjL
02-28-2003, 12:41 AM
Originally posted by Cmarc

Is there another place I could find this information (without parsing all team pages every hour) ?


Yes! Read this http://www.seventeenorbust.com/help/textStats.mhtml

I'm sure Mike would much rather have you leeching just the text dump of the database info than parsing every page. :)

-Louie

Cmarc
02-28-2003, 12:57 AM
Thanks Louie. I already use this page for my stats (and I must say it is the best stats dump I've seen so far, and certainly the only one that comes with detailed explanations).

Leeching all pages is, of course, not an option. My problem is that the values for users are totals, and I have no way of determining what portion of this total was produced while the user was a member of his current team. I can reflect team changes that happened since I started collecting data (about 3 weeks ago) but not what happened before.
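
To illustrate the limitation: with periodic snapshots of each user's cumulative total and current team, only the increase between consecutive snapshots can be credited to a team. A sketch of that bookkeeping follows; the data layout and names are assumptions for illustration, not the actual format of the text dump.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One snapshot per collection run: (team at that time, user's cumulative total).
Snapshot = Tuple[str, float]


def work_per_team(snapshots: List[Snapshot]) -> Dict[str, float]:
    """Credit each increase in a user's total to the team recorded at the later snapshot."""
    credited: Dict[str, float] = defaultdict(float)
    for (_, prev_total), (team, total) in zip(snapshots, snapshots[1:]):
        credited[team] += total - prev_total
    return dict(credited)


# Hypothetical history: the user switches from team A to team B between runs 2 and 3.
history = [("A", 100.0), ("A", 110.0), ("B", 125.0), ("B", 140.0)]
print(work_per_team(history))
```

The 100 units already present in the first snapshot cannot be assigned to any team, which is exactly the missing history described above.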

Marc

Lagardo
02-28-2003, 06:18 PM
FWIW: Your (Cmarc's) stats have an odd update pattern: Once an hour, either on h:20 or on h:33 in turn. Is there a reason for that?

Cmarc
03-01-2003, 04:44 AM
Yes, that is a bit strange. I pull the stats from the page listed above every hour at h:33. From the start I've seen this strange pattern. Changing the update time does not help much as far as I can tell. If I move the update time, the choices become :20 and :00. Given the large number of updates and the fact that there are only two different period lengths, the variance does not matter much in the long run; it's just a minor curiosity.

Troodon
03-09-2003, 10:12 PM
Ever since I joined this project, the stats for my daily production haven't been accurate. For example, today my computer has done (and reported) 12 blocks. That's 3 GcEMs (250 McEMs per block). But my stats say 766.99 McEMs.
The total production is measured accurately. Is anyone else having this little problem?

eatmadustch
03-10-2003, 11:52 AM
I'm having similar problems
I've opened a thread: http://www.free-dc.org/forum/showthread.php?s=&threadid=2726

cmprince
04-15-2003, 04:35 PM
Sorry to resurrect a dead thread, but this seemed better than starting a new one...

Since the remaining tests n < 3M will probably take a while (that means you, stragglers :p), it might be more informative to bump the Remaining Test Boundary up to 4M.

Just a thought...

Mystwalker
04-16-2003, 04:38 PM
As we're at ~3,475,000 at the moment, this would mean that #(tests left < 20M) = #(tests left < 4M)... ;)