
View Full Version : Too much downtime



jasong
06-30-2005, 06:07 PM
Guys, maybe it's my problem. Maybe I'm WAAAAAAAY too addicted to the idea of maxing out my cpu 100%, but I'm frustrated.

I am oh so willing to give Eon all the processor it wants when it wants it, but if there is more than, say, an hour of downtime in the average 24 hour period, I would like a simple way to pass it along without having to use some obscure program. I've noticed that there's a feature to change Eon's priority. I also noticed it seems to be disabled.

I am noticing way too much idle time is coming my way. Here's what's going to happen:

I am going to a different project, but I'm only going to reserve a couple days work at a time. If the problem is solved in a timely fashion, I'll come back, otherwise I'm staying away permanently.

And guys, you brought this on yourselves. There are plenty of anal retentives like me that want to max out their cpu and don't like idle time. And, as an afterthought, even if a lot of people only give out 50% of their cpu, what's to say you don't get more than twice as many users? And a lot of the people, like me for instance, are willing to give Eon 100% when it wants it, so you lose simply because I'm angry about the idle time my cpu is getting.

I'll come back in a heartbeat if you solve the problem. And the problem is, and I feel I need to restate it: There is no easy way to change Eon's priority settings in a way that'll stick.

magnav0x
06-30-2005, 06:49 PM
Exactly why I stopped crunching EON. Come over and help on SoB some :p

jasong
06-30-2005, 06:52 PM
Originally posted by magnav0x
Exactly why I stopped crunching EON. Come over and help on SoB some :p
Actually, I chose Riesel Sieve sieving.

Similar concept, different numbers.

black_civic55
06-30-2005, 07:05 PM
there really isn't that much down time, they're just having some troubles right now. keep it running and when you see it down run something else

graeme
06-30-2005, 08:29 PM
We're quite sure it's a memory problem at the moment, and new memory is on order to get in asap. Sorry about this, I really felt that we were doing a good job keeping the down time to a minimum lately.

I'll get onto this priority problem with the Windows client right away. But I think the down time does not have much to do with the priority setting. Windows allows the client to use available cpu time with the default priority. Setting it higher tends to cause the client to interfere with normal Windows functions.

There are different kinds of down time that we've had on the server over the past year. One has to do with communication problems and a memory leak on the server. We have this under control now. Another has been hardware problems or power outages, which are bound to happen once in a while. Another is the periodic down time while the server does a serial calculation on its own. For the simulation we are currently running, this is an essential part. Ideally this should take less than 20 minutes out of a day.

I understand the drive to get the most cpu usage out of computers. This is one important part of doing a valuable calculation. The other is trying to use computational resources to learn as much as possible. We don't want to focus on the former at the expense of the latter. If statistics are the only motivating factor, this is irrelevant, but if you want to do as much as you can with your computer, it does matter. Not to pick on some distributed computing projects, but does anyone ask about the value of numerically solving specific mathematical relations, or factoring large numbers? There is certainly value in learning how to do it, but to actually spend enormous computation time to repeat these calculations over and over again with different numbers seems crazy to me -- even if you get 100% efficient use of your cpu. Anyways, sorry for the rant, which is a little out of place here. We'll solve the problems with our system as quickly as possible, and try to minimize down time.

vaughan
06-30-2005, 09:16 PM
Thanks for the update Graeme. We are waiting patiently for you to get the new memory installed and the project back on-line again. :cool:

PY 222
07-01-2005, 02:34 PM
Is the project back online?

Any ETA on when it'll be back?

omer
07-01-2005, 04:49 PM
What I can do is to provide an interface which will do the following for you:

1. EON running at a priority X (which is less than the other normal programs)
2. Specify a program which will run at a priority < X (any executable like c:\run.exe)

so both programs will be running at all times. When EON needs its cycles, it will take all of them or something like 99% of them, and when of course it is sleeping or waiting or anything, the other program will take the cycles. Good enough?
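
For anyone who wants to try the idea by hand in the meantime, here is a rough Python sketch of the two-priority setup described above. It is not the interface omer is proposing and not part of the Eon client: the names `priority_kwargs` and `launch_pair`, the level names "low" and "lowest", and the command lines you pass in are all assumptions made for illustration. On Windows it relies on the priority-class constants in Python's `subprocess` module (Python 3.7 or later); elsewhere it raises the child's nice value.

```python
import os
import subprocess
import sys

# Hypothetical level names for this sketch; they are not Eon settings.
# Values are nice increments used on POSIX systems.
NICE_INCREMENT = {"low": 10, "lowest": 19}


def priority_kwargs(level):
    """Return subprocess.Popen keyword arguments that start a child at reduced priority."""
    if level not in NICE_INCREMENT:
        raise ValueError("unknown priority level: %r" % (level,))
    if os.name == "nt":
        # Windows: pick a priority class below NORMAL_PRIORITY_CLASS.
        flags = {
            "low": subprocess.BELOW_NORMAL_PRIORITY_CLASS,
            "lowest": subprocess.IDLE_PRIORITY_CLASS,
        }
        return {"creationflags": flags[level]}
    # POSIX: raise the nice value in the child just before it execs.
    increment = NICE_INCREMENT[level]
    return {"preexec_fn": lambda: os.nice(increment)}


def launch_pair(eon_cmd, filler_cmd):
    """Start the main client below normal priority and a filler program below that.

    Both commands are argument lists, e.g. ["c:\\run.exe"]; the paths are
    whatever you supply, nothing here knows where the Eon client lives.
    """
    eon = subprocess.Popen(eon_cmd, **priority_kwargs("low"))
    filler = subprocess.Popen(filler_cmd, **priority_kwargs("lowest"))
    return eon, filler
```

With this arrangement the scheduler should hand most spare cycles to the first program whenever it has work, and the filler picks up the rest while the first one is sleeping or waiting. How close that gets to the 99% figure depends on the operating system's scheduler.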

graeme
07-01-2005, 05:03 PM
Thanks for your patience. ETA for the memory is Tuesday, so the server will be down for the weekend. For those in the US, have a great (eon free) July 4th.

jasong
07-01-2005, 05:04 PM
Originally posted by omer
What I can do is to provide an interface which will do the following for you:

1. EON running at a priority X (which is less than the other normal programs)
2. Specify a program which will run at a priority < X (any executable like c:\run.exe)

so both programs will be running at all times. When EON needs its cycles, it will take all of them or something like 99% of them, and when of course it is sleeping or waiting or anything, the other program will take the cycles. Good enough?

That would be fabulous.

Your project seems to be very worthy of my cycles. This addition will probably give you a steadily increasing influx of users over the next couple months at least.

Thank you

rsbriggs
07-01-2005, 05:13 PM
And please don't forget to add some sort of benchmark functionality (or at least show elapsed time in the output) to help those of us that tend to tweak CPUs or OSes for a given project!

Maybe time n operations/forcing calls on a set of known data? Anything that allows an "apples-to-apples" performance comparison of two different boxes..
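
A rough sketch of what such a benchmark could look like, in Python. Everything here is an assumption for illustration: `benchmark` and `toy_force_call` are made-up names, and the workload is a stand-in, not an actual Eon calculation. The point is only that the same fixed input is timed on every box, and the best of several runs is reported to reduce noise from other processes.

```python
import time


def toy_force_call(positions):
    """Stand-in workload: sum of inverse-square distances over all pairs of 1D points."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = positions[i] - positions[j]
            total += 1.0 / (d * d)
    return total


def benchmark(work, data, repeats=5):
    """Time `repeats` calls of work(data); return (best_seconds, calls_per_second)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        work(data)
        timings.append(time.perf_counter() - start)
    best = min(timings)
    rate = 1.0 / best if best > 0 else float("inf")
    return best, rate


if __name__ == "__main__":
    # Known, fixed data set so two machines run exactly the same work.
    known_data = [i * 0.5 for i in range(400)]
    best, rate = benchmark(toy_force_call, known_data)
    print("best run: %.6f s, about %.1f calls/s" % (best, rate))
```

Comparing the best-run time from two machines on the same `known_data` gives the apples-to-apples number; elapsed time per work unit in the client output would serve the same purpose if it were exposed.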

PY 222
07-01-2005, 05:55 PM
Originally posted by graeme
Thanks for your patience. ETA for the memory is Tuesday, so the server will be down for the weekend. For those in the US, have a great (eon free) July 4th.

Thanks for the heads up. I am going back to Free-DC's home project FaD for the time being.

Let me know when you are ready for more firepower ok. :thumbs:

jasong
07-07-2005, 05:11 PM
Just as a heads-up, I check this forum about once a day to look for updates on this issue, so if you're twiddling your thumbs, well...

Not that I'm suggesting anything, just wanted to let you guys know I'm in the wings, waiting.:D

graeme
07-07-2005, 05:15 PM
we're back

jasong
07-07-2005, 05:21 PM
Originally posted by graeme
I understand the drive to get the most cpu usage out of computers. This is one important part of doing a valuable calculation. The other is trying to use computational resources to learn as much as possible. We don't want to focus on the former at the expense of the latter. If statistics are the only motivating factor, this is irrelevant, but if you want to do as much as you can with your computer, it does matter. Not to pick on some distributed computing projects, but does anyone ask about the value of numerically solving specific mathematical relations, or factoring large numbers? There is certainly value in learning how to do it, but to actually spend enormous computation time to repeat these calculations over and over again with different numbers seems crazy to me -- even if you get 100% efficient use of your cpu.
In my opinion, there's a simple answer to that question, and it involves human nature:

First, you discover a hobby that only increases your electrical bill about $2.50 per computer. After a couple months, some people (like me) become addicted, and then... Well, have you ever been to Vegas? I haven't, but I can picture people at the slots, slowly feeding money into the machines.

It's kind of like that, whether or not the calculations are useful is only of partial relevance, depending on the psychological makeup of the individual, their education level, hobbies beforehand, etc. etc. etc.

In my case, I like to be gratified by my hobby multiple times per day. Uploading sieved values does that for me.

graeme
07-07-2005, 05:32 PM
True, there are lots of different motivations for working on these projects.

Let me also say again that I should not have been negative about the purely numerical projects. I really know nothing about them, and there could be research reasons for doing those kinds of calculations, even besides the general joy of distributed computing. I was really just grumpy about our own damn server problems. I'm happier now that it's working again.

Thor
07-07-2005, 05:59 PM
Since I didn't beat you in announcing the good news, I have a question:

Are the stats being generated? Should have been updated by now...


:Pokes:

Thor

graeme
07-07-2005, 06:12 PM
This is ongoing evidence that the eon users are usually one step ahead of the admin.

The history stats are now updating. On the eon site, you can now see who has done work units since the server started. I'll keep an eye on the Free-DC stats page to make sure it gets going as well.

Thor
07-07-2005, 06:34 PM
The User is always ahead of the admin.. My dad always finds a way to break his PC, ways I didn't even dream about before...
Must be a bad aura...:rotfl:


By the way, free-dc stats are updating so everything seems to work alright.


Thanks:thumbs:

Thor

rbutcher
07-07-2005, 10:13 PM
Here in Australia I still can't connect, neither to the eon website nor the dimer server. ??

graeme
07-07-2005, 11:57 PM
Ah, I saw your message earlier today and assumed it was some glitch, but I just realized what the problem was. Before I figured out that the problem on the server was bad memory, I was thinking that the problem could be due to a rogue client. So I was temporarily blocking different clients to see if one was causing the server to hang. None did, of course, but I accidentally left the server in a state blocking (I assume) your IP. Very sorry about this. I realized what was going on because you mentioned that you could not access the web page. I've changed the configuration, and it should be back to normal.

Mustard
07-07-2005, 11:57 PM
Then something is broken between you and the server cause it's working fine for me.

Mustard
07-08-2005, 12:08 AM
How-some-ever.............. my clients are working like dogs, but no points are accumulating????????? So moving stuff back to another project. :(

graeme
07-08-2005, 12:09 AM
It's me, I broke it.

BTW, as I was struggling to fix the server last week, I kept thinking about how a bad client could really mess things up. I sincerely want to thank everyone for being so nice to the server and helping the project. Since the project is open source, it would be very easy to mess with, and I don't think anyone has done this. I am very grateful that we haven't had to worry about double checking data or the other paranoid things that the SETI folk have had to do. Thank you all.

graeme
07-08-2005, 12:24 AM
Lexx, I made one error, and shifted the logs by a day on the eon site. I also hung mysql for a moment, but I think the total number of work units done should be correct. Are you sure the total number is wrong? Could you perhaps start one client to make sure things are updating properly?

PY 222
07-08-2005, 06:55 PM
Originally posted by graeme
Ah, I saw your message earlier today and assumed it was some glitch, but I just realized what the problem was. Before I figured out that the problem on the server was bad memory, I was thinking that the problem could be due to a rogue client. So I was temporarily blocking different clients to see if one was causing the server to hang. None did, of course, but I accidentally left the server in a state blocking (I assume) your IP. Very sorry about this. I realized what was going on because you mentioned that you could not access the web page. I've changed the configuration, and it should be back to normal.

I've just tried restarting the client on two of my boxes and they are getting ZZZZ.

So I am guessing that you are blocking me as well?

graeme
07-08-2005, 07:05 PM
no, it's some glitch in the automatic restart. I'll get right on it -- and thanks for the heads-up.

PY 222
07-08-2005, 07:39 PM
Originally posted by graeme
no, it's some glitch in the automatic restart. I'll get right on it -- and thanks for the heads-up.

Ok great. The clients are working again.

I've thrown in 2 puters for now and both are working nicely. Maybe once it gets stable enough, I'll start putting in more GHz.

Great work guys. :thumbs:

Mustard
07-09-2005, 12:22 AM
Well I'm back at the point of clients running, but no points being generated again... have moved stuff back to OGR.

graeme
07-09-2005, 12:30 AM
Yeah, something is getting stuck with mysql. However, I'm quite sure that we don't lose any work units. For example, right now there was a significant jump in the stats as everyone got updated. I'll keep a closer eye on it.

Mustard
07-09-2005, 12:33 AM
heh heh............... you know, if it was a horse, I'd say trade it in on a new one! LOL