
Discussion: Handling Non-Unique Usernames



Dyyryath
03-09-2004, 12:14 PM
Before I can finish building a project agnostic schema for the unified stats system, we need to hammer out a good way of handling projects that don't require unique usernames and/or userids.

One of the great things about Distributed Folding (from a 3rd party stats point of view) is the way the project leaders generate unique, non-changing numeric IDs for each user. This makes tracking a user from team to team and across name changes painless and accurate. Unfortunately, not all projects provide this feature (despite its simplicity). This causes problems when:

More than one user has the same name on the same team (especially if they are close in rank)
Users change their name (especially if they change teams at the same time)
Users move from team to team (which can have different effects if the project has portable WUs or not)
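
To make the first case concrete, here is a minimal sketch of what happens when the only key available is the name. The rows, team names, and the totals_by_name helper are all made up for illustration and are not taken from any real project dump:

```python
from collections import defaultdict

# Hypothetical rows from a project dump: (team, username, work_units).
# No numeric user ID is available, so the name is all we have.
rows = [
    ("TeamA", "bob", 120),
    ("TeamA", "bob", 118),   # a different person who picked the same name
    ("TeamB", "alice", 300),
]

def totals_by_name(rows):
    """Sum work units per (team, username), the only key available."""
    totals = defaultdict(int)
    for team, username, work_units in rows:
        totals[(team, username)] += work_units
    return dict(totals)

# The two "bob" entries collapse into one record and can no longer be told apart.
merged = totals_by_name(rows)
```

Three input rows come out as two records, which is exactly the kind of silent merge that is hard to undo later.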

What I'm looking for here are ideas on how to handle those projects that don't have unique IDs or a requirement that all usernames are unique. In the past I've used a variety of strange hacks to make this work on projects without unique identifiers of some sort, but I've never really been pleased with any of them. Has anyone here come up with a really solid way of doing this?

Additionally, I'd like to see us put together a matrix of each project and what its rules are for:

User IDs. Does it provide them? Are they static?
Usernames. Do they have to be unique? Is there a limit to the characters that can be used?
ID Change. Can users change their name?
Team IDs. Does it provide them? Are they static?
Teamnames. Do they have to be unique? Is there a limit to the characters that can be used?
User Movement. Are users allowed to move from team to team?
WU Portability: When a user moves from team to team, do previously processed WUs go with them?

Here's an example using Distributed Folding:

Project Info Table (http://www.free-dc.org/project-stats-info.html)

If anyone (or multiple people) would like to send me the information for other projects, I'll gladly fill it in. It'd be a nice reference for the future and it would also make it easier to see what the 'problem child' projects will be. ;)

Darkness Productions
03-09-2004, 01:33 PM
The problem child of problem children. This thing seems like the red-headed step-child of a red-headed step-child, to put it mildly... Seti@Home.

From your table:
Project - Seti@Home
User ID - No
Username - Any (including special characters)
Name Change - Yes
Team ID - Alpha-Numeric
Teamname - Any (as with username, including special characters)
User Movement - Yes
WU Portability - Yes
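
For anyone who wants to keep these answers in code rather than in a table, one possible shape is a small record per project. This is only a sketch: the ProjectRules class, its field names, and the needs_workaround helper are assumptions for illustration, and the values are simply the Seti@Home answers listed above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProjectRules:
    """One row of the project matrix (field names are illustrative)."""
    project: str
    user_id: str          # "Yes"/"No", as written in the matrix
    username: str         # allowed characters / uniqueness note
    name_change: bool
    team_id: str          # e.g. "Numeric", "Alpha-Numeric", "No"
    teamname: str
    user_movement: bool
    wu_portability: bool

    def needs_workaround(self) -> bool:
        """True when the project gives no user ID to track by."""
        return self.user_id == "No"

# Values copied from the Seti@Home answers in this post.
seti = ProjectRules(
    project="Seti@Home",
    user_id="No",
    username="Any (including special characters)",
    name_change=True,
    team_id="Alpha-Numeric",
    teamname="Any (including special characters)",
    user_movement=True,
    wu_portability=True,
)
```

Filtering a list of these on needs_workaround() would give a quick list of the 'problem child' projects.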

Cmarc
03-09-2004, 03:39 PM
<tr>
<td>Seventeen or Bust</td>
<td>Yes</td>
<td>Non-Unique</td>
<td>Yes</td>
<td>Numeric</td>
<td>Any</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<td>Ecc2</td>
<td>Yes</td>
<td>Non-Unique</td>
<td>Yes</td>
<td>Numeric</td>
<td>Any</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td>Folding@Home</td>
<td>No</td>
<td>Non-Unique</td>
<td>No</td> <!-- Not sure about this one -->
<td>Numeric</td>
<td>Any</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<td>Muon</td>
<td>No</td>
<td>Non-Unique</td>
<td>Yes</td> <!-- requires manual intervention from project manager -->
<td>Alphanumeric</td>
<td>Any</td> <!-- Team names are usually enclosed in square brackets -->
<td>Yes</td> <!-- requires manual intervention from project manager -->
<td>Yes</td>
</tr>

Dyyryath
03-10-2004, 11:10 AM
I've also added a column for how the information is presented. If it's just an HTML page that needs to be parsed, then HTML:address would probably be good. If it's a dump of some type designed for 3rd party stats guys then DUMP:address would work. If it's a CSV file then CSV:address would be appropriate...
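
A stats script could split entries in this TYPE:address form along the following lines. It is only a sketch: the parse_source name, the KNOWN_KINDS set, and the example.org address in the usage note are made up for illustration.

```python
# The three kinds suggested in this post; extend as new formats turn up.
KNOWN_KINDS = {"HTML", "DUMP", "CSV"}

def parse_source(entry):
    """Split 'TYPE:address' into (kind, address).

    Only the first colon is treated as the separator, so the colon
    inside 'http://' stays part of the address.
    """
    kind, sep, address = entry.partition(":")
    kind = kind.strip().upper()
    address = address.strip()
    if not sep or kind not in KNOWN_KINDS or not address:
        raise ValueError(f"not a TYPE:address entry: {entry!r}")
    return kind, address
```

For example, parse_source("DUMP:http://example.org/stats.txt") gives the pair ("DUMP", "http://example.org/stats.txt"), while a bare URL with no type prefix is rejected.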

Darkness Productions
03-10-2004, 01:12 PM
Ha. With as many users as Seti@Home has, it'd be impossible to track everyone, so there really isn't a page with all the users, per se. But you can get the stats data for the top 200 teams from http://setiathome.ssl.berkeley.edu/stats/team/team_type_0.html, and then the data for each of those teams from http://setiathome.ssl.berkeley.edu/fcgi-bin/fcgi?cmd=team_lookup&name=$teamname
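
Since team names can contain special characters, $teamname has to be escaped before it goes into that lookup URL. A minimal sketch, assuming the server accepts standard query-string encoding (spaces as '+', other characters as %XX), which I have not verified; the team_lookup_url name is just for illustration:

```python
from urllib.parse import urlencode

# Base address taken from the lookup URL quoted above.
TEAM_LOOKUP = "http://setiathome.ssl.berkeley.edu/fcgi-bin/fcgi"

def team_lookup_url(teamname):
    """Build the team lookup URL with the team name safely escaped."""
    query = urlencode({"cmd": "team_lookup", "name": teamname})
    return TEAM_LOOKUP + "?" + query
```

Without the escaping, a name containing '&' would be cut off at that character and the rest read as a separate parameter.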

Cmarc
03-10-2004, 05:13 PM
Ecc2, DUMP:http://ecc2.student.utwente.nl/cleanstats.php
17orBust, DUMP:http://www.seventeenorbust.com/stats/textStats.mhtml
FAH, DUMP:http://folding.stanford.edu/daily_team_summary.txt
FAH, DUMP:http://folding.stanford.edu/daily_user_summary.txt
Muon, DUMP:http://www.stephenbrooks.org/muon1/rawstats.txt

magnav0x
03-10-2004, 05:35 PM
In addition to the traditional muon1 dump:
DUMP:http://www.stephenbrooks.org/muon1/rawstats.txt
there is also the little-known:
DUMP:http://www.stephenbrooks.org/muon1/teamids.txt

Overall I'd say the general stats dump provided by DPAD is difficult to work with. I suppose I could centralize the data into a database and provide a dump that's easier to parse (to avoid unneeded calculations, data shifting and whatnot by the stats engine).

Bok
03-11-2004, 09:44 AM
Project - Eon
User ID - No
Username - Any (at least some special chars are known to be allowed)
Name Change - No
Team ID - No
Teamname - Any (same as name)
User Movement - Yes
WU Portability - Yes

Project - MD5CRK
User ID - No
Username - alphanumerics plus @.-_
Name Change - No
Team ID - No
Teamname - alphanumerics plus @.-_
User Movement - Yes
WU Portability - Yes

For Eon there is a raw file

http://eon.chem.washington.edu/groups/stats_raw.php

Bok

magicfan241
03-12-2004, 03:15 PM
Project - GRID.ORG/UD!!!!!
User ID - No
Username - Any, but must be unique because of the password system in use; it has been used as the User ID in the past
Name Change - No
Team ID - No
Team name - Any, but again must be unique because of the password system they use.
User Movement - Yes
WU Portability - NO

Bok
03-12-2004, 03:45 PM
Are you sure you meant Eon ^ magicFan241????? I've already done that one.

I think you didn't change the project name..

Bok :Pokes:

magicfan241
03-12-2004, 06:45 PM
DAMN IT!!!

I changed a few things, but the Edit didn't take effect.

Changing it now, it should be Grid.org/UD

DOH!!!

magicfan241

pfb
05-29-2004, 08:55 PM
Originally posted by Dyyryath
One of the great things about Distributed Folding (from a 3rd party stats point of view) is the way the project leaders generate unique, non-changing numeric ids for each user. This makes tracking a user from team to team and across name changes painless and accurate.

Not in my experience - quite a few 'None' users in the Teamless user group have the same UID and TID, which makes it very difficult to track those people...

An example:

http://wibble.bounceme.net/Sneakers/teamless.png

Note that in the top 20, 7 Users have the same UID of 0 - despite 5 of them being active on this protein...

This is the main reason I ignore the Teamless user group from all bar the protein total and ETC calculations - it's too difficult to monitor them as individuals or as a team...
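
For what it's worth, spotting the affected IDs up front is simple enough. A rough sketch, with made-up rows rather than the real Teamless data, and a shared_uids helper named only for illustration:

```python
from collections import Counter

def shared_uids(users):
    """Return the set of UIDs that appear on more than one row."""
    counts = Counter(uid for uid, _name in users)
    return {uid for uid, n in counts.items() if n > 1}

# Hypothetical (uid, username) rows; several 'None' users share UID 0.
users = [(0, "None"), (0, "None"), (4711, "None"), (0, "None")]

suspect = shared_uids(users)
```

Any row whose UID lands in that set can't be tracked as an individual, so it can be skipped or lumped into a single bucket.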

:bang: