08-29-2003, 05:48 PM

.plans of decibel (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel):

:: 29-Aug-2003 11:45 CDT (Friday) ::
As I feared, the Sybase stats database is corrupt. I have copies of all
participant data (team membership, retires, team and participant settings) as
of ~8/29/03 0:00 UTC. If I have to restore using this data, any changes people
made last night will be lost. I'm still trying to find a way to at least read
from the corrupt database, since only one table is corrupt. If I can do this,
no data will be lost during the restore.

Unfortunately, no matter which route I take, stats will be down for most of
today at a minimum.

More info as available.

:: 29-Aug-2003 02:40 CDT (Friday) ::
RC5 stats are still wrong, but I've turned access back on. Hopefully I'll have
it fixed tomorrow.

:: 28-Aug-2003 21:41 CDT (Thursday) ::
The problems with stats are more serious than I thought. The table that stores
how much work each email address did on each day no longer matches the other
tables. I know that sounds rather serious, but that information can always
be re-created from the log files if it comes to that. I'm in the process of
loading in a backup copy of that table; hopefully it will allow me to fix this
with a minimum of downtime.

More info when available...

:: 28-Aug-2003 20:00 CDT (Thursday) ::
The RC5-72 statsrun bombed; I'm going to have to re-load today's data, so
expect stats to be a bit late.

My stats-system (http://stats.the-mk.dyndns.org/) was able to fetch some data, but nobody knows whether it is correct...

I hope they'll repair their database as soon as possible.

08-30-2003, 02:49 AM
.plans of decibel (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel)

:: 30-Aug-2003 00:03 CDT (Saturday) ::
I just talked with MattR; he's not going to be able to recover the existing
database. We'll be restoring from a backup as soon as it's done bunzipping.

:: 29-Aug-2003 23:58 CDT (Friday) ::
MattR was able to get the database online again, so I now have copies of the 3
tables. This means that no data should end up lost out of all of this.

There are 3 options we have right now. First, we can attempt to repair the
existing stats database. MattR's attempting this right now. The second option
is to drop the existing database, restore from the July 12th backup, and bring
in the updated information. The third option is for us to cut-over to stats
running on PostgreSQL, which is 95% done right now.

I'm playing it a bit by ear before deciding which way to go. Going to
PostgreSQL is very tempting, since we'll need to do it in the near future
anyway, but I don't like the idea of going to production when it's not complete
and hasn't been beta-tested.

Whatever happens, stats definitely won't be up until tomorrow afternoon at the
very earliest.

:: 29-Aug-2003 20:09 CDT (Friday) ::
Blower is up and running again; apparently it died because the raid controller
freaked out after a disk failure. I'm still hoping to recover any data that was
modified last night, but if that doesn't happen soon I'll just go with what we

:: 29-Aug-2003 15:59 CDT (Friday) ::
Looks like all the trouble is being caused by hardware issues. Moose is working
on getting blower up and running again, at which point I'll have a better idea
what's left to be done.

08-30-2003, 07:17 PM
:: 30-Aug-2003 13:21 CDT (Saturday) ::
Quick update...

Sybase is up and running again, but with month-old data. Sybase's BCP routine
doesn't handle embedded delimiter characters very well, so I'm a bit uneasy
about trusting the data pulled out of the corrupted database, especially for
loading in a month of changes impacting 2600 participants.

Because of this, I'm concentrating on getting the data loaded into PostgreSQL,
since there should be very few changes (the data in PostgreSQL is only a few
hours older than when I shut down stats).

In any case, progress is being made and we're much closer to working stats now
than we were yesterday.
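
For anyone wondering why embedded delimiters make a dump hard to trust, here is a minimal sketch in Python. It is not Sybase's BCP and does not reproduce its behaviour; the rows, the `|` delimiter and the function names are all made up for illustration. It only shows the general failure: a field that contains the delimiter shifts every column after it unless the dump quotes its fields.

```python
import csv
import io

# Hypothetical participant rows; the second email contains the delimiter.
rows = [
    ["1001", "alice@example.com", "Team A"],
    ["1002", "bob|work@example.com", "Team B"],
]

def naive_dump(rows, delimiter="|"):
    """Join fields with a bare delimiter, with no quoting or escaping."""
    return "\n".join(delimiter.join(r) for r in rows)

def naive_load(text, delimiter="|"):
    """Split each line on the delimiter; embedded delimiters shift columns."""
    return [line.split(delimiter) for line in text.split("\n")]

def quoted_dump(rows, delimiter="|"):
    """Write the same rows with the csv module, which quotes where needed."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buf.getvalue()

def quoted_load(text, delimiter="|"):
    """Read rows back, honouring the quoting applied by quoted_dump."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

broken = naive_load(naive_dump(rows))    # second row comes back with 4 fields
intact = quoted_load(quoted_dump(rows))  # round-trips unchanged
```

With a month of changes to reload, a single shifted row like the one in `broken` could silently put data in the wrong column, which is presumably the kind of risk the update is describing.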

08-30-2003, 08:02 PM
Sounds ugly alright. :bang:
Thanks for keeping us posted! :D

08-31-2003, 04:45 AM
:: 30-Aug-2003 18:04 CDT (Saturday) ::
The PostgreSQL copy of stats is up-to-date now, although it does need to
process August. The HTML side still needs to be set up, which paul will work on
in 10 hours or so (it's midnight his time right now).

Unfortunately this means another day without stats, but I'm really
uncomfortable with trying to get data back into Sybase properly. There might
be some information that has been lost going into PostgreSQL, but it should
only be some changes that were made in a window of about 8 hours. The Sybase
database is currently 6 weeks out of date, so there's much more room for

While paul is getting the HTML stuff set up there might be periods where you can
not get to any stats web pages at all. Don't worry, this is just him working on

I want to thank everyone for their patience; the end is in sight.

I'm just posting it here, IB, because they write it in a very small font; I copy and paste it here so that I can read it more easily :D :neener: and to keep you informed :thumbs:

08-31-2003, 05:17 AM
09-02-2003, 03:02 PM
as of decibel's .plan (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel):

:: 02-Sep-2003 09:46 CDT (Tuesday) ::
Two issues people should be aware of:

There is a problem with apache processes sometimes consuming several hundred
megs of memory. We don't know what's causing this yet, although it seems to
only happen during certain parts of the daily statsrun. Because of this, I've
dropped the connection timeout to 30 seconds. This could affect users who are
on dial-up.

Database connection errors are not being properly trapped. When PHP can't
connect to the database, you'll get incorrect output, such as 'team does not
exist'. Give it a few minutes and try again; the data is there.
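
The second issue (a failed connection showing up as 'team does not exist') comes down to not telling "the database is unreachable" apart from "the query found nothing". Below is a rough sketch of that distinction. It is written in Python with the standard `sqlite3` module as a stand-in, not the PHP/PostgreSQL code the stats site actually runs; the `teams` table, `StatsUnavailable`, `lookup_team` and `render` are all hypothetical names.

```python
import sqlite3

class StatsUnavailable(Exception):
    """Raised when the database cannot be reached (hypothetical name)."""

def lookup_team(connect, team_id):
    """Return the team name, or None if no such team exists.

    `connect` is a zero-argument callable returning a sqlite3 connection.
    """
    try:
        conn = connect()
    except sqlite3.Error as exc:
        # A failed connection is an outage, never "no such team".
        raise StatsUnavailable("cannot connect to stats database") from exc
    try:
        row = conn.execute(
            "SELECT name FROM teams WHERE id = ?", (team_id,)
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row else None

def render(connect, team_id):
    """Turn the lookup result into the message a visitor would see."""
    try:
        name = lookup_team(connect, team_id)
    except StatsUnavailable:
        return "stats database is temporarily unavailable, try again shortly"
    return name if name is not None else "team does not exist"

def working_connect():
    """In-memory database with one made-up team, for demonstration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO teams VALUES (1, 'Example Team')")
    return conn

def failing_connect():
    """Simulate the database being down."""
    raise sqlite3.OperationalError("unable to open database")
```

Calling `render(failing_connect, 1)` gives the outage message, while `render(working_connect, 99)` gives 'team does not exist', so the two cases can no longer be confused.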

:: 01-Sep-2003 18:31 CDT (Monday) ::
I forgot to mention that stats are working through a backlog right now; should
be caught up in a day or two.

:: 01-Sep-2003 18:26 CDT (Monday) ::
Stats are finally back again, now powered by PostgreSQL, though there's still
work to be done.

All code that would modify the database is currently disabled, so you can't
edit your info, join teams, etc.

There are still some other minor bugs. If you find any bugs, please enter
them into bugzilla, but PLEASE check to make sure someone else hasn't already
entered it!

:: 31-Aug-2003 22:39 CDT (Sunday) ::
Paul's done a lot of work today, and except for a few bugs I think we're ready
to roll out mostly functional stats. I'd do it tonight but I want to make sure
he's around in case of problems.

All of the lookup functionality exists; what is missing is participant editing.
We should be able to get that done fairly soon though.

Looks like stats are online, but at the moment they only show stats as of 2003-08-14... I'll wait until the stats are fully online and current before I bring my dataspider back online... I hope they didn't change their site's code, or else I'd need to rewrite my dataspider :rolleyes:

09-03-2003, 02:36 AM
I thought OGR-24 was DONE?!?!?!

THE-MK - get them stats back online, or .tar it up and send it my way.

They are WAY too nice to let go! :thumbs:

09-03-2003, 11:53 AM
OK, I'll put those nice stats on my server again... give me some time

I hope I'll get it done by tomorrow... Let's see whether their new stats are compatible with my good old stats-system :rolleyes: (I hope so...) Did you notice that they changed their database from Sybase to PostgreSQL?

Did you know that the official "current" stats are from 17-Aug-2003 on OGR and 14-Aug-2003 on RC5-72?

My system has data up to 28-Aug-2003, when their server broke...

I'll start trying it at about midnight GMT+01

and their newest .plan (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel):

:: 02-Sep-2003 17:02 CDT (Tuesday) ::
Good news; I've found and corrected the problem with the apache processes
consuming several hundred meg, which means not only can I bump the timeout
setting back up, but that you're much less likely to run into database
connection errors (which are still masquerading as other errors).

:: 02-Sep-2003 13:58 CDT (Tuesday) ::
I'm doing some quick maintenance which means some pages won't be working for
the time being. Should hopefully take only 10 minutes or so.

09-03-2003, 12:16 PM
I gave it a try... copied the dataspider scripts onto my new server and...

tada... doesn't work... :( :swear:

The site-code changed at stats.distributed.net and I need to rewrite the dataspider. As I said: I can start that at about midnight GMT +1... be patient :D

09-03-2003, 06:42 PM
1. never accept promises from me
2. I'm too tired
3. I'll try to work on my d.net-stats tomorrow (but first see point 1 :D)
4. sorry folks
5. be patient, I'll bring them back online (and see first point :D)
6. the official stats aren't current, so there would be no point in rushing the rewrite of my dataspider (the latest data at stats.distributed.net at the moment is 19-Aug-2003, so about two weeks are missing) :spank:
7. I'll start with a fresh database

09-04-2003, 06:49 PM
Guess who's back! Back again! My d.net-stats are back! Tell a friend! :D

I managed to rewrite my dataspider to fit their new HTML output. I also started a fresh database; give it some time to collect a few days of stats so it can show nice last-day/last-week numbers :Pokes:

Some new linkage: http://the-mk.dyndns.org/d-net/


and the newest .plans (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel)

:: 03-Sep-2003 11:18 CDT (Wednesday) ::
I'm trying to fix the kernel shared memory parameters so we can run more
PostgreSQL processes. Stats should be back in 10 minutes or so.

But their stats are still NOT UP TO DATE!! So mine are not up to date :weggy:

09-04-2003, 08:12 PM
:thumbs: Good to hear!!! :D

09-05-2003, 12:11 AM
Don't spend too much time fixing your data spider; it's likely we'll have any number of HTML changes to fix various bugs as we come across them. We are working on getting some XML export functions put together to allow you to pull stats more easily.
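
To show why an XML export would be easier on a dataspider than scraping HTML, here is a small sketch in Python. The export format has not been described in this thread, so the `SAMPLE` document, its element and attribute names, and the `parse_teams` helper are all invented for illustration; only the standard-library `xml.etree.ElementTree` calls are real.

```python
import xml.etree.ElementTree as ET

# Hypothetical export; the real d.net XML format is not described here.
SAMPLE = """<stats project="rc5-72" date="2003-08-28">
  <team id="1" name="Example Team" blocks="1234"/>
  <team id="2" name="Other Team" blocks="567"/>
</stats>"""

def parse_teams(xml_text):
    """Map team id to its name and block count, read from named attributes."""
    root = ET.fromstring(xml_text)
    return {
        int(team.get("id")): {
            "name": team.get("name"),
            "blocks": int(team.get("blocks")),
        }
        for team in root.iter("team")
    }

teams = parse_teams(SAMPLE)
```

Because the spider reads named attributes rather than positions in a page layout, cosmetic HTML changes would not break it; it would only need changes if the export format itself changed.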

09-05-2003, 12:51 AM
09-07-2003, 07:05 AM
decibel's .plan (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel):

:: 07-Sep-2003 02:50 CDT (Sunday) ::
I'm restoring a backup of the stats database; stats will be un-available until
this is done. Sorry for the inconvenience.

09-07-2003, 01:52 PM
decibel's .plan (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel):

:: 07-Sep-2003 11:02 CDT (Sunday) ::
I'm in the process of restoring an older copy of the stats database; I'm not
sure how long it will take, though.

09-07-2003, 09:07 PM
Originally posted by the-mk
decibel's .plan (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel):

me sets proxy to hang on to stubs/blocks till this stats fiasco we have been enduring for 2 weeks now is over with..:(

And on a side note;

We'd also like to ask everyone to check their machines with older
clients and be sure a version higher than v2.8014 is being run
(v2.8015 was released around May 2001). Although OGR results from
these older clients are currently still being accepted, they have
diminished mathematical value to us. Additionally, a large majority
of traffic that we are continuing to get from those old versions
represent unauthorized worm deployments of our client. Because of
these reasons, in the next couple of weeks we may begin blocking
traffic from these older versions and instructing them to shut down.
If you still have machines running one of these older clients when
this is done, it may cease running and contributing to your stats

This is the end of the BeOS client's participation, then, as the latest BeOS client was 2.8010.


09-08-2003, 10:08 PM
Well not sure if it is an improvement, but the stats are back up.......July 14th (http://stats.distributed.net/team/tmember.php?project_id=24&team=910183941&source=y) :rolleyes:

09-09-2003, 01:14 AM
decibel's .plan (http://n0cgi.distributed.net/cgi/dnet-finger.cgi?user=decibel)

:: 08-Sep-2003 18:49 CDT (Monday) ::
Well, stats are finally back up. Blower is re-running logs starting from July
13th and it should be caught up in a few days. There's currently an issue with
participant history being very slow, but I'm hoping that updating some index
statistics will solve that problem.

OK, seems to be up again...

Started a fresh database again...

I hope they won't find any new errors :rolleyes: