Current Outage of All that is Rieselsieve



bryan[RS]
12-04-2005, 11:39 PM
As you may (or may not) have noticed, all of the RieselSieve Project has been down since about noon Eastern. Some time this morning, our DNS was poisoned, and we have been working since then to change nameservers, get DNS records re-propagated, and try to determine exactly how everything broke. That is why RieselSieve.com comes up as non-existent, not just unresponsive. All servers are up and running, and as soon as the DNS comes back online, crunching will resume. Lee, Sean and I have been working on this all day, as I said, and we're really sorry about the downtime. This is the first non-software-related problem we've had, and it's hitting us hard. Hopefully things will be back online before I leave for school tomorrow. If they are, I'll post here; otherwise I'll post an update when I get home (6pm EST or so).

Thanks all for your patience.

Bryan
Stats Administrator
RieselSieve Project

AMDave
12-05-2005, 04:34 AM
:cry: my cache just ran out.
can you suspend the WU timeouts ?

good luck bryan, Lee & team
hope you can sort it all out soon

We'll be here for you when your DNS is back

ladypcer
12-05-2005, 04:54 AM
I tried to check into this project yesterday and the site was down.
I hope you get it all sorted, and I'll be sure to check into it when it's back up and running.

bryan[RS]
12-05-2005, 06:54 AM
Jeff,

Thanks for the idea. I can't suspend them, but I can delay them. All WU's from prior to 11-30-05 have been set as being issued on 11-30-05. This means all jobs are no more than 5 days old. If needed, I'll reset the times again, but hopefully things will be back to normal in a few more hours.
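
For anyone curious, the delay above amounts to moving old issue dates up to a floor date. Here is a minimal sketch in Python, assuming the issue dates are available as a plain list. The function names and the sample data are hypothetical, and this is not the actual server code:

```python
from datetime import date

# Floor date from the post: work units issued before 11-30-05
# are treated as if they were issued on 11-30-05.
FLOOR = date(2005, 11, 30)

def delay_issue_dates(issue_dates, floor=FLOOR):
    """Return the issue dates with anything older than `floor` moved up to it."""
    return [max(d, floor) for d in issue_dates]

def max_age_days(issue_dates, today):
    """Age in days of the oldest work unit as of `today`."""
    return max((today - d).days for d in issue_dates)

# Made-up sample data, for illustration only.
sample = [date(2005, 11, 12), date(2005, 11, 29), date(2005, 12, 2)]
delayed = delay_issue_dates(sample)
```

Measured on 12-05-05, the oldest job in `delayed` is 5 days old, which is where the "no more than 5 days" figure comes from.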

As of 6:55 EST, things still aren't back to normal. ladypcer - thanks for the interest, and sorry for the lousy first impression. Things normally run much better. :blush:

Bryan

ladypcer
12-05-2005, 07:19 AM
It's ok. I'm a veteran of several DC projects and know that things don't always go as planned. I never judge a project by one day's events.
:cool:

Mustard
12-05-2005, 11:38 AM
How about posting what your IP addresses are supposed to be, so that they could be added to our hosts files to get over this hump? Especially if things don't clear up today?

Bruce

bryan[RS]
12-05-2005, 06:10 PM
Bruce,

Here's some IP information. One problem is that we use dynamically updating DNS, so these numbers may change (I will change them here as soon as I find out they've changed). I'm also setting up a no-ip failover DNS as kind of a band-aid, but it gives no direct number and is not a DNS fix. Why the propagation hasn't happened yet, I don't know, but I'm looking into it. Something is still fishy.

Current IP Address for the Main LLRNet Server (this changes): 216.196.208.192
Please consider using the address llrnet.redirectme.net.

Current IP Address for the Double-Check LLRNet Server (this is an almost static IP): 24.210.29.64
Please consider using the address dc-llrnet.redirectme.net.


The latest update from Verisign says that Godaddy has put the domain into Registrar-Hold. We're trying to work with them and our DNS providers to get this fixed. Lee has already talked with Godaddy and they are working on a solution, and we're looking for other routes. Again, on behalf of Lee, Sean, and myself, we're really sorry about this - and hopefully people don't go running from the project because of this hiccup.

Bryan

Stats Administrator
RieselSieve Project


****************
Technical stuff follows:

Any DNS experts out there that can lend a hand? Or does anyone run a nameserver that I can check a query on? Here's the problem I think:


Our nameservers are now from dnsmadeeasy.com, and by using dig, here's what I find:

$ dig @ns0.dnsmadeeasy.com rieselsieve.com

; <<>> DiG 9.3.1 <<>> @ns0.dnsmadeeasy.com rieselsieve.com
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30059
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 5

;; QUESTION SECTION:
;rieselsieve.com. IN A

;; ANSWER SECTION:
rieselsieve.com. 0 IN A 10.185.17.194

;; AUTHORITY SECTION:
rieselsieve.com. 86400 IN NS ns2.dnsmadeeasy.com.
rieselsieve.com. 86400 IN NS ns3.dnsmadeeasy.com.
rieselsieve.com. 86400 IN NS ns4.dnsmadeeasy.com.
rieselsieve.com. 86400 IN NS ns0.dnsmadeeasy.com.
rieselsieve.com. 86400 IN NS ns1.dnsmadeeasy.com.

;; ADDITIONAL SECTION:
ns0.dnsmadeeasy.com. 86400 IN A 63.219.151.3
ns1.dnsmadeeasy.com. 86400 IN A 205.234.154.1
ns2.dnsmadeeasy.com. 86400 IN A 66.117.40.198
ns3.dnsmadeeasy.com. 86400 IN A 216.129.109.1
ns4.dnsmadeeasy.com. 86400 IN A 69.26.190.254

;; Query time: 41 msec
;; SERVER: 63.219.151.3#53(63.219.151.3)
;; WHEN: Mon Dec 5 17:53:24 2005
;; MSG SIZE rcvd: 231

Correct information is not getting passed. However, look at what xname says:

$ dig @ns0.xname.org rieselsieve.com

; <<>> DiG 9.3.1 <<>> @ns0.xname.org rieselsieve.com
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29656
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;rieselsieve.com. IN A

;; ANSWER SECTION:
rieselsieve.com. 0 IN A 10.185.80.234

;; AUTHORITY SECTION:
rieselsieve.com. 86400 IN NS ns1.xname.org.
rieselsieve.com. 86400 IN NS ns0.xname.org.

;; ADDITIONAL SECTION:
ns0.xname.org. 600 IN A 195.234.42.1
ns1.xname.org. 600 IN A 193.218.105.149

;; Query time: 138 msec
;; SERVER: 195.234.42.1#53(195.234.42.1)
;; WHEN: Mon Dec 5 17:54:48 2005
;; MSG SIZE rcvd: 126


How is xname declaring itself to be authoritative? And what's with these private IP's popping up?
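
On the second question, both A records above fall inside 10.0.0.0/8, which is reserved for private networks and is not routable on the public Internet. A minimal sketch in Python to confirm that, using the standard `ipaddress` module (the dictionary and function names are just for illustration):

```python
import ipaddress

# A records returned in the two dig answers above.
answers = {
    "ns0.dnsmadeeasy.com": "10.185.17.194",
    "ns0.xname.org": "10.185.80.234",
}

def is_private(addr):
    """True if addr falls in a private-use range such as 10.0.0.0/8."""
    return ipaddress.ip_address(addr).is_private

# Map each queried nameserver to whether its answer is a private address.
flags = {ns: is_private(ip) for ns, ip in answers.items()}
```

Both entries in `flags` come back True, while the real server addresses posted earlier (216.196.208.192 and 24.210.29.64) are public. This only confirms that the answers are bogus; it does not explain where they are coming from.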

bryan[RS]
12-05-2005, 10:11 PM
Another update, before Lee arrives home from work. We're still running on those no-ip servers:

Main Website (Including Forums): http://rieselsieve.redirectme.net
Stats Website: http://llrnet.redirectme.net

LLRNet Main Server: llrnet.redirectme.net, port 7000, or use 216.196.208.192
LLRNet Double Check Server: dc-llrnet.redirectme.net, port 7000 or use 24.210.29.64

Updates will follow as we get information from Godaddy.

Bryan

Mustard
12-05-2005, 11:13 PM
What he is saying in English is that if you are running llrnet, add the following to your /etc/hosts file in Linux or whatever:

24.210.29.64 dc-llrnet.redirectme.net
216.196.208.192 llrnet.redirectme.net
216.196.208.192 llrnet.rieselsieve.com



and then you can at least connect to the llrnet server and upload results and suck down more work. You should probably go into the llr-con... file and increase your cache to 35 or so, in case this continues.

Also, this will crap out at some point when the new dns takes over, so you will have to delete it or comment it out from your /etc/hosts file.
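
If you would rather script the cleanup than edit by hand, here is a rough sketch in Python. It works on the text of a hosts file instead of touching /etc/hosts directly. The helper names are made up for this example, and the IPs are the ones posted above, which may change:

```python
# Temporary entries from the post above (these IPs may change).
TEMP_ENTRIES = [
    ("24.210.29.64", "dc-llrnet.redirectme.net"),
    ("216.196.208.192", "llrnet.redirectme.net"),
    ("216.196.208.192", "llrnet.rieselsieve.com"),
]

def render(entries):
    """Format (ip, hostname) pairs as hosts-file lines."""
    return "\n".join(f"{ip} {name}" for ip, name in entries)

def comment_out(hosts_text, hostnames):
    """Comment out every active line that maps one of `hostnames`."""
    out = []
    for line in hosts_text.splitlines():
        # Ignore anything after '#', then split into IP and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and any(h in hostnames for h in fields[1:]):
            line = "# " + line
        out.append(line)
    return "\n".join(out)

names = {name for _, name in TEMP_ENTRIES}
before = "127.0.0.1 localhost\n" + render(TEMP_ENTRIES)
after = comment_out(before, names)
```

Lines that are already commented out are left alone, so running `comment_out` twice does no harm.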

Bruce

Mustard
12-06-2005, 01:01 PM
Well no method works to connect now. And like a fool I stopped a llrnet system to try and get some more units for the cache, and *voila* now I can't load one of the libraries, so even with stuff in the cache, it no workie... :(

Mustard
12-06-2005, 03:44 PM
I guess what would be nice to know now is what the IP addresses of the new revamped DNS stuff are supposed to be...

bryan[RS]
12-06-2005, 05:54 PM
Sorry about the delay - since I'm at school from 7:00 - 18:00 EST, it's hard to keep up with daytime changes - which is why I left the redirectme.net address that could be queried.

Here are the latest IP addresses - Godaddy said everything should work, once the waiting time (up to 48 hours from about 10 pm Sunday night) runs out. So, we'll see in the next 6 hours or so how things are.

216.196.211.194 rieselsieve.com
216.196.211.194 llrnet.rieselsieve.com
24.210.29.64 dc.rieselsieve.com

Note - you don't have to manually put those redirectme.net addresses in your hosts files - they dynamically update from No-ip.com when a dns change is made.

Hopefully this is the last few hours we have to put up with this disaster - thanks all for your patience.

Bryan

Stats Administrator
RieselSieve Project

Mustard
12-07-2005, 11:50 AM
Originally posted by bryan[RS]

Note - you don't have to manually put those redirectme.net addresses in your hosts files - they dynamically update from No-ip.com when a dns change is made.




I stuffed them there, Bryan, because my stuff worked with them and I had no idea what's what in your DNS system. I'm glad that you posted the new DNS info with what needs to be there. This is a good lesson, though, in why projects should run with all fixed-IP addresses. Participants can make use of this information to keep working when there are DNS hassles, whether on the project end or on the participant's end. Personally, I've had multiple instances this past year when my ISP's DNS servers have had issues.

Anyway, as you mentioned in one of your posts, I too hope that people don't leave the project because of this. We've been making great progress, and it would be a shame to see it slow down.

Bruce

maefly
12-07-2005, 06:48 PM
Well, it looks like you're still having some trouble, Bryan :(
I'm sure you're very frustrated at this point. :hair:

I went ahead and updated all my boxen with llrnet.redirectme.net, so you'll get some production during this DNS madness.

I hope you resolve your DNS issues soon!

Jeff
(maefly)

Mustard
12-07-2005, 07:42 PM
Yup, more problems, it appears... I can't get through with any of the IP addresses you handed over now, Bryan, where the latest IPs worked earlier today.

And a system that just relies on the standard Internet DNS doesn't respond to anything either. So I don't think you could have hosed this up like this if you had tried to hose it up.... <sick humor>

bryan[RS]
12-07-2005, 09:33 PM
Sorry, looks like another IP refresh. Here are the new IPs:

216.68.186.52 rieselsieve.com
216.68.186.52 llrnet.rieselsieve.com
24.210.29.64 dc.rieselsieve.com

Godaddy initially blamed our problems on our DNS provider. When Lee finally convinced them that REGISTRAR-HOLD kept us out of the Verisign Root File (therefore removing us from the Internet), they apologized and agreed that it was their fault. As of this afternoon, they had started to process the change, removing the hold, and beginning the propagation process - since it had not really started yet, we're still talking up to 36-48 hours before this mess is fixed. Obviously, since the nameservers are live, we're hoping for a much faster turnaround.

I will say we've already started a discussion about how to keep this problem from occurring again. One thing that I am pushing for (and had been a previous idea) is to have built-in failover servers in the llrnet client. This, of course, needs to be programmed. Any assistance with Lua programming would be greatly appreciated, if anyone can help :) I'm going to try to look into this myself.
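
To give a rough idea of what built-in failover could look like, here is a sketch in Python. This is not llrnet code, and the real change would have to be written in Lua. The server list is an assumption based on the addresses posted in this thread, and port 7000 for llrnet.rieselsieve.com is assumed to match the redirectme.net address:

```python
import socket

# Assumed server list, in order of preference (see the note above).
SERVERS = [
    ("llrnet.rieselsieve.com", 7000),
    ("llrnet.redirectme.net", 7000),
]

def connect_with_failover(servers, timeout=5.0, connect=socket.create_connection):
    """Try each (host, port) in order; return the first connection that opens."""
    last_error = None
    for host, port in servers:
        try:
            return connect((host, port), timeout)
        except OSError as exc:
            # Covers DNS lookup failures as well as refused or timed-out connects.
            last_error = exc
    raise ConnectionError(f"no server reachable: {last_error}")
```

Calling `connect_with_failover(SERVERS)` would try the primary name first and fall back to the no-ip address if the lookup or the connection fails. The `connect` parameter is only there so the logic can be tried out without touching the network.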

Another thing will be changing Lee to a static connection in January. He's working that out with his ISP, and we're looking forward to the change. One more change is the possibility of registering a mirror domain (e.g., rieselsieve.net) and having the DNS go through another provider. This, of course, would only be useful if the failover server I mentioned above is worked out. Also, team proxies are on the horizon - I've found the code that sets up how a proxy is handled, now I just have to figure out how to communicate the user info individually. It's going to take some time.

Thanks for everyone's support and watchful eyes during this difficult time. Hopefully this clears out any problems that we would have faced in 2006....and beyond. Hold on - just a little more time till we're back.

Bryan
Stats Administrator
RieselSieve Project

Mustard
12-07-2005, 10:00 PM
Well, there is one more little issue you need to look hard at... and that is that (Linux-wise anyway) the client needs to be a static compile, with any required libraries residing on the client hard drive. I had work units, but I couldn't kick off because it would get to the point of loading win32 libraries and everything would crap out from there on, without a working connection to your server. And I'm sorta curious, too, as to why it even says that in the Linux llrnet client anyway?

Bruce

bryan[RS]
12-07-2005, 10:34 PM
Bruce,

AFAIK, the client had all of its libraries. It's asking to use the Win32 binaries? Weird, I'll have to look into that - I do know, though, that it doesn't need to connect to the server to get any binaries. So, something is fishy...I'll ask around.

Bryan

bryan[RS]
12-08-2005, 06:24 PM
New IP's again. Godaddy has communicated with Internic, and hopefully this gets flushed out soon.

216.196.208.201 rieselsieve.com
216.196.208.201 llrnet.rieselsieve.com
24.210.29.64 dc.rieselsieve.com

Bryan

bryan[RS]
12-09-2005, 04:30 PM
Well, I finally have some good news. Our Zone File has been created at the root nameservers. This means that we're going to be out of the woods soon. We just have to wait for the IPs to propagate through - and all will be back online. I know right now the dynamic IP doesn't seem to have updated, so some servers are resolving rieselsieve.com incorrectly. One more HOSTS file update, and then we'll be able to use regular DNS around midnight EST (5:00 GMT). Thanks again everyone!

216.196.209.194 rieselsieve.com
216.196.209.194 llrnet.rieselsieve.com
24.210.29.64 dc.rieselsieve.com


Bryan

CaptainMooseInc
12-09-2005, 05:10 PM
I tried getting on the site again today...this time it didn't instantly go to that "action cancelled" screen...now it's trying to get the server at least...didn't load yet but it's showing signs that it's coming back.

-Jeff

b2uc
12-10-2005, 01:27 AM
Everything should be back to normal...hopefully:)

Sorry for the downtime everyone..it has been a very trying time for me this week.

Lee