View Full Version : RNA World



Saenger
01-04-2010, 12:47 PM
RNA World (http://www.rnaworld.de/rnaworld/stats)
A new project from the orbit of Rechenkraft (http://www.rechenkraft.net).


About RNA World (beta)
RNA World (beta) is a distributed supercomputer that uses Internet-connected computers to advance RNA-related research. You can participate by downloading and running a free program on your computer.

RNA World (beta) is based at the Rechenkraft.net e.V. (http://www.rechenkraft.net) research facility located in Germany.

* Project description (http://www.rechenkraft.net/wiki/index.php?title=RNA_World/Projectdescription/en)
* Project location & personnel (http://www.rechenkraft.net/wiki/index.php?title=RNA_World/Project_location_%26_personnel/en)
* Cooperation partners (http://www.rechenkraft.net/wiki/index.php?title=RNA_World/Cooperation_partners/en)
* Sponsors (http://www.rechenkraft.net/wiki/index.php?title=RNA_World/Sponsors/en)
* FAQ (http://www.rechenkraft.net/wiki/index.php?title=RNA_World/FAQ/en)
* Presentation of the project (http://boinc.berkeley.edu/trac/attachment/wiki/WorkShop09/RNA%20World_MW-Talk_23-10-2009_Barcelona_final_mod4PDF.pdf) at the 5th Pan-Galactic BOINC Workshop (http://boinc.berkeley.edu/trac/wiki/WorkShop09)

The permission to publish stats was given today. (http://www.rechenkraft.net/phpBB/viewtopic.php?f=76&t=10513)

Bok
01-04-2010, 01:49 PM
Yup, already done :) - http://stats.free-dc.org/stats.php?page=proj&proj=rna

Please note that this is not open to the public just yet though. Soon, I'm told :) Seems to run pretty well so far..

zombie67
01-04-2010, 11:39 PM
Tell me about this RNA, of which you speak! ;)

LAURENU2
01-05-2010, 12:09 AM
Tell me about this RNA, of which you speak! ;)
RNA World project description
RNA World is a distributed supercomputer that uses Internet-connected computers to advance RNA research. This system is dedicated to identify, analyze, structurally predict and design RNA molecules on the basis of established bioinformatics software in a high-performance, high-throughput fashion.

In contrast to classical bioinformatic approaches, RNA World does not rely on individual desktop computers, web servers or supercomputers. Instead, it represents a continuously evolving cluster of world-wide distributed machines of any type. As such, RNA World is very heterogeneous and, depending on the sub-project, currently addresses Internet-connected computers running Linux, Windows and OSX operating systems - your computer could be an important part of it. The fact that hardware and electricity costs are shared among the volunteer contributors raises the possibility of performing interesting analyses which under economical aspects would often not be affordable. In return, RNA World is not for profit, exclusively uses open source code and will make its results available to the public.

Sounds good to me. I am asking for an invite.

Bok
01-05-2010, 07:09 AM
Ribo-Nucleic Acid :p

LAURENU2
01-05-2010, 10:09 AM
Ribo-Nucleic Acid :p

Is that the invite code Ribo-Nucleic Acid :lmao:

Bigred
01-05-2010, 03:19 PM
Is that the invite code Ribo-Nucleic Acid :lmao:

Didn't work for me.:p

LAURENU2
01-05-2010, 08:12 PM
Didn't work for me.:p
Ya they are just Not accepting any new members :bang:

Saenger
01-06-2010, 11:43 AM
Ya they are just Not accepting any new members :bang:
The chief (Dr. Michael Weber) just said (http://www.rechenkraft.net/phpBB/viewtopic.php?p=114179#p114179) that the next bunch of accounts will probably be available sometime this month.

Digital Parasite
01-06-2010, 12:28 PM
The chief (Dr. Michael Weber) just said (http://www.rechenkraft.net/phpBB/viewtopic.php?p=114179#p114179) that the next bunch of accounts will probably be available sometime this month.

Interesting so that is what he is up to now. I just signed a release form for his new book on Distributed Computing that has articles from a lot of people in the DC scene. It will be available with a CC license I think so anyone can download it and read.

Jeff.

LAURENU2
01-06-2010, 10:39 PM
Interesting so that is what he is up to now. I just signed a release form for his new book on Distributed Computing that has articles from a lot of people in the DC scene. It will be available with a CC license I think so anyone can download it and read.

Jeff.

:hair:Are they all calling me Crazy Again :crazy:

yoyo
01-07-2010, 01:01 PM
Hi,
currently we are struggling a bit with homogeneous redundancy and validating results. We do not want to be unfriendly to the users if they get invalid results. Therefore the project isn't open.
yoyo
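
For readers unfamiliar with the term: homogeneous redundancy in BOINC means that all copies of a work unit are sent only to hosts of the same class, so that their results can be compared directly during validation. A minimal Python sketch of that idea follows. It assumes a simplified host class of (operating system, CPU vendor); the function and field names are hypothetical, and this is not the project's or BOINC's actual code.

```python
def hr_class(host):
    """Simplified host class: (operating system, CPU vendor). Assumption for illustration."""
    return (host["os"], host["cpu_vendor"])


def assign(workunit, host):
    """Send a copy of the work unit to the host only if the host's class matches
    the class the work unit is already committed to (or if it has none yet)."""
    committed = workunit.get("hr_class")
    if committed is not None and committed != hr_class(host):
        return False  # a different platform could produce slightly different numbers
    workunit["hr_class"] = hr_class(host)
    workunit.setdefault("hosts", []).append(host["name"])
    return True


if __name__ == "__main__":
    wu = {}
    print(assign(wu, {"name": "a", "os": "Linux", "cpu_vendor": "Intel"}))
    print(assign(wu, {"name": "b", "os": "Windows", "cpu_vendor": "AMD"}))
    print(assign(wu, {"name": "c", "os": "Linux", "cpu_vendor": "Intel"}))
    print(wu["hosts"])
```

The first host fixes the class for the work unit; later hosts of a different class are simply skipped for that work unit and can be given other work instead.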

Angus
01-11-2010, 10:55 AM
Back to the original issue of rechenkraft racking up big credit while no one else can participate -

Will you reset credits when RNA goes public, so the rechenkraft team doesn't have a totally unfair head start?

Bok
01-11-2010, 11:13 AM
Personally I would vote NOT to reset the credits, knowing it's added work in order to achieve this.

Once they go live, and as long as there are lots of WUs, some of the bigger teams out there like SETI.USA, L'Alliance Francophone and Seti Germany, to name just a few, would probably get up to the same levels within a few days anyway, so it would not make the slightest bit of difference in the long run.

Just my own opinion though.

Bok

LAURENU2
01-11-2010, 09:33 PM
Personally I would vote NOT to reset the credits, knowing it's added work in order to achieve this.

Once they go live, and as long as there are lots of WUs, some of the bigger teams out there like SETI.USA, L'Alliance Francophone and Seti Germany, to name just a few, would probably get up to the same levels within a few days anyway, so it would not make the slightest bit of difference in the long run.

Just my own opinion though.

Bok
Hey Bok, you forgot to mention LAURENU2 :lmao:

Bok
01-11-2010, 10:51 PM
Hey Bok, you forgot to mention LAURENU2 :lmao:

Oh I thought of it, but I kept Free-DC out :)

LAURENU2
01-11-2010, 10:55 PM
:thumbs::jester::lmao:

Angus
01-14-2010, 11:11 AM
I see this thread has now been censored into banality.

Bok
01-14-2010, 11:13 AM
No censoring of posts has occurred that I'm aware of. All posts off-topic from this thread were just moved to another. I'll remove yours and this one too at some point soon.

Stay on topic please.

Angus
01-14-2010, 11:14 AM
No censoring of posts has occurred that I'm aware of.

This thread used to be two pages, and there are a LOT of missing posts. Where are they now?

Bok
01-14-2010, 11:24 AM
re-read my post above that I edited to explain..

new thread is here (http://www.free-dc.org/forum/showthread.php?t=21811)

Angus
01-14-2010, 11:42 AM
I made the mistake of looking in the yoyo@home section. silly me.

Angus
01-15-2010, 09:05 PM
Someone tell yoyo that his database is in distress. Web site and BM both get errors.

1/15/2010 5:55:23 PM|RNA World|Message from server: Server can't open database

Michael H.W. Weber
01-16-2010, 11:30 AM
Hi guys! ;)


Will you reset credits when RNA goes public, so the rechenkraft team doesn't have a totally unfair head start?
Although we have not yet discussed this in detail, there are no plans to bring credits back to zero when the project goes public. The reason for this is fairness, because the people who helped during development deserve at least to keep the credits they earned by carrying out hard work. It should also be kept in mind that most of them contributed much more than CPU cycles, since they identified a number of issues and even actively helped to resolve them. I am sure you will understand that.


Someone tell yoyo that his database is in distress. Web site and BM both get errors.

1/15/2010 5:55:23 PM|RNA World|Message from server: Server can't open database
Thanks for the information. Issue resolved. ;)
Best regards,
Michael.

P.S.: An exact date for going public is not yet fixed since it depends on the outcome of the final test rounds we are currently cycling through. ;) I actually wanted to make the project go public in November, right after my initial project presentation at the BOINC Workshop held in Barcelona last October, but then a new version of the implemented software came out and we decided to first incorporate that and test it. We just like to be up to date. :D

P.P.S.: If anyone has questions concerning the RNA World project, just go ahead and ask me. :D

Bok
01-16-2010, 01:18 PM
Thanks for coming on over Michael!

Looking forward to the project running :)

Bok :thumbs:

Angus
01-16-2010, 07:07 PM
Since the 68 or so users in the last 24 hours brought the project's server to its knees, and the few thousand WUs that were processed filled up their disk, I think they are not quite ready for prime time.

How can anyone go so wrong in the specification for a server?

yoyo
01-17-2010, 02:01 AM
Since the 68 or so users in the last 24 hours brought the project's server to its knees, and the few thousand WUs that were processed filled up their disk, I think they are not quite ready for prime time.

How can anyone go so wrong in the specification for a server?

We are not so well financed as a university. So we used the equipment which we already have. Now you can also see why we are not opening up yet, which you asked for above in this thread.
yoyo

Michael H.W. Weber
01-17-2010, 06:55 AM
Thanks for coming on over Michael!

Looking forward to the project running :)

Bok :thumbs:
You are welcome! :D And thanks for your help with RNA World. It has indeed been quite some time since I last visited this forum, but I think I will come by more frequently.


Since the 68 or so users in the last 24 hours brought the project's server to its knees, and the few thousand WUs that were processed filled up their disk, I think they are not quite ready for prime time.

How can anyone go so wrong in the specification for a server?
Firstly, please note that this server also hosts the Yoyo@home project. Then, these "68 or so users" assigned approx. 400 machines to RNA World, most of which are quite powerful. However, this is exactly why we are still in the testing phase, and we have already decided to migrate the RNA World project to a much more powerful machine as a consequence of our testing. ;)


We are not so well financed as a university.
In fact, we have no external funding at all. RNA World at present is financed completely on the basis of Rechenkraft.net membership fees (of only 2,50 € per month; tax-deductible of course). But as soon as the first publishable results pour in, we hope to acquire some additional funding.

Michael.

LAURENU2
01-17-2010, 11:39 AM
Long time no see, Dr. Michael H.W. Weber :hiya:
Glad to see you're still in the game.

Will you be looking into a CUDA app for your project?

And when you're really ready to test your platform, send me an invite
(I sent you a request but got no reply from you),
and I will port 160 cores (400+ GHz) to see if your site is ready for DC-ing.

Michael H.W. Weber
01-17-2010, 01:12 PM
Long time no see, Dr. Michael H.W. Weber :hiya:
Glad to see you're still in the game.

Will you be looking into a CUDA app for your project?
RNA World will incorporate several additional apps in the future, and, of course, if it turns out that GPU usage is beneficial in terms of computational output we will do our best to make use of it. ;) At present, however, the implemented programs apparently do not profit from GPUs - they have been tested for it. So, currently GPUs have to stay aside.


And when you're really ready to test your platform, send me an invite
(I sent you a request but got no reply from you),
and I will port 160 cores (400+ GHz) to see if your site is ready for DC-ing.
Excellent! :thumbs:

To whom exactly did you send your request (to which email address)? I have actually replied to all requests as mandatory for good project support. ;) But maybe something escaped my attention (e.g. spam folder or something of this sort). However, at present without exceptions (for fairness reasons) we have to reject all further participant requests except the team captains (which are automatically imported). There are some final tests going on including resolving a server issue, and, although spare-time operated, we don't want to start with avoidable glitches wherever possible. ;D

Michael.

Angus
01-17-2010, 01:38 PM
<grinch mode on>

Let's debunk some of the grandiose aura of this "RNA WORLD" thing...

"Rechenkraft.net e.V. research facility" is simply some members of a DC team with a server or two either in someones apartment or a co-lo site somewhere in Germany (more likely, since yoyo talks about "renting" another server). There is no "facility" - no pictures of a modernistic lab or ivy-covered science building.

They have no connection to any scientific endeavor - no university, no visible principal scientist, not even a grad student doing his thesis for which he needs this research done.

They seemed to have picked an existing program and wrapped it in BOINC, to do research for which there is no current customer.

They set this thing up, made a pitch for it at the BOINC Pan-Galactic Conference (don't get me started on DA's obvious megalomania), poked at the app for a few months, then imported all the BOINC-wide teams, so now everyone in the BOINC universe is ready and anxious to participate.

But - it isn't ready. They can only let one user per team participate, and it has to be the team founder. No chance for a team to assign one selected active user.

The first sort-of live test falls flat on its face, because this thing is installed on the same server as yoyo's other wrapper efforts, and somehow the "yoyo@home" project has been assimilated into this Rechenkraft.net e.V. research facility entity.


Now - they want the public to fund this thing by donating so they can "rent" (?) a server and bandwidth for one year. What happens at the end of that year, if the apps aren't running, and they haven't been able to sell any tangible results to anyone? Another donation campaign?

Maybe we should all just pay them to add BOINC credits to our account, because that's all that's happening here, folks! They are selling credits!

<grinch mode off>


Have at it, moderators.

Bok
01-17-2010, 01:48 PM
Have at it, moderators.

Ok if that's what you want as you obviously intend to irritate any and all.

If you have nothing positive to contribute, please don't. You seem to have a personal vendetta against Rechenkraft/Yoyo/RNA world and I'm getting somewhat tired of hearing it. All you are achieving is to make Free-DC look bad.

Further action pending after I talk with the other admins.

Bok

Angus
01-17-2010, 02:02 PM
I guess transparency isn't so much encouraged here any more. The "Free" seems to have disappeared from Free-DC where we could freely and openly discuss anything about any project, and be honest with one another. It seems now that any nay-saying is not allowed.



I would have posted on their project site, but they don't have an integrated BOINC message board - only some off-site board somewhere that requires a separate account to post.

outlnder
01-17-2010, 03:26 PM
I, personally, don't feel any animosity against Free-DC for allowing Angus to pontificate his opinions. However, it is really making Angus look like an ass. If he wants to continue his childish rants against RNA, let him. He is the only person that looks stupid.

LAURENU2
01-17-2010, 09:28 PM
<grinch mode on>

<grinch mode off>
I prefer <grinch mode off> Myself
:|ot|:
Sorry :bonk:

Angus
01-17-2010, 09:38 PM
I prefer <grinch mode off> Myself
:|ot|:
Sorry :bonk:

I was led astray by this: Grumpy is good for you (http://news.bbc.co.uk/2/hi/8339647.stm)

still :|ot|:

LAURENU2
01-17-2010, 10:39 PM
Excellent! :thumbs:

To whom exactly did you send your request (to which email address)? I have actually replied to all requests as mandatory for good project support. ;) But maybe something escaped my attention (e.g. spam folder or something of this sort). However, at present without exceptions (for fairness reasons) we have to reject all further participant requests except the team captains (which are automatically imported). There are some final tests going on including resolving a server issue, and, although spare-time operated, we don't want to start with avoidable glitches wherever possible. ;D

Michael.

I am sorry, I think I misspoke. I think I only asked in your forum, and I did get a response. It was another project I confused yours with :blush:
You have always in the past given good project support for the projects you worked on :thumbs:

Michael H.W. Weber
01-18-2010, 06:19 AM
I am sorry, I think I misspoke. I think I only asked in your forum, and I did get a response. It was another project I confused yours with :blush:
Well, then everything seems just fine, I guess. :thumbs:


You have always in the past given good project support for the projects you worked on :thumbs:
Oh yes, I like DC a lot, so I try to put in my time whenever possible. Thank you for that nice remark. ;)

@Angus: Just a few things for clarification (I always try to take comments seriously although, sadly, people sometimes make that really hard by the style in which they communicate). RNA World is a bioinformatics project, so we do not need a lab at all. A few computers and a good webserver are sufficient. Still, as you can see from our cooperation partner list, we clearly do have the capability to perform lab experiments to validate our computational results. In fact, I work in an RNA lab every day and have more than 10 years of practical experience in a diverse set of scientific fields. :D And yes, we do outsource server hosting to professionals while we keep server administration in house. This, in our experience, is the best and most economical way to do it. And we do indeed have at least some experience, since we have been operating in DC for more than 10 years now. First we "only" participated in DC projects. Then we started the Yoyo@home project to help scientists not too familiar with BOINC to acquire more volunteer support for their projects. We achieved that by wrapping a set of clients of non-BOINC projects we liked into the BOINC infrastructure and, apart from your crediting concerns, I think we did a nice job (and Yoyo is the driving force here). This scenario allowed us to gain a lot of practical experience over many years in this "business" (yeah, I hate that word in this context), also in terms of user requirements and support. And now we are implementing our own scientific project, in which we use established software plus our own software developments to put our own research ideas into practice. You can say that we have evolved from a DC community to DC project developers. So, what is bad about that? :D

Finally, concerning myself (since you addressed this issue), maybe you have overlooked that I am not a Ph.D. student; I have held a doctoral degree in natural sciences (Dr. rer. nat., chemistry) for quite some time now, and I am the principal investigator at Rechenkraft.net you were seemingly looking for (and yes, I do not get paid for doing that, just as nobody at Rechenkraft.net gets paid for anything; it is just our spare-time interest, but one that has been sustained for over a decade). That said, I would like to say that - as we witness in DC every day - volunteers can contribute significant efforts towards achieving meaningful scientific results. Hence, I do not care much about degrees (although from your writing it appeared to me that they seem important to you). Rather, what I have in mind is a nice style of cooperative working together where everybody can contribute small ideas. That, taken together, will ultimately give a nice overall picture. I believe DC is not at all about competition, as is always claimed. If you look a bit beyond competition in terms of who has the fastest machine, you will find that a bigger goal is achieved COOPERATIVELY. And that, to date, in my opinion, is the most powerful principle in nature - not competition. But maybe let us keep that aside for now, as a philosophical side remark, so to speak... :D

Michael.

gopher_yarrowzoo
01-18-2010, 06:34 AM
Michael, I do believe you've hit the protein on the amino acid (if you'll forgive the bad pun). Yes, I do know about Adenine, Cytosine, Guanine and Uracil - notice I didn't say Thymine, since, well, that's DNA..

Yes many do it for the "I'm bigger than you" but it's all about the science in the end, the more power gets thrown at it the quicker we get to prove / disprove the theory behind the project or prove / disprove that particular drug / protein.

Angus
01-18-2010, 12:15 PM
@MHWW thanks for clarifying some of the issues.

I spent a LOT of time yesterday moving through your web sites and pasting bits and pieces into Google Translate trying to get answers for myself. With almost all of the content in German, it's very difficult for a non-German speaker to get any kind of picture of your organization. Do you plan to provide all the content in English?

This is further compounded by having stuff scattered around through multiple sites. There is no BOINC message board, however there appears to be a rechenkraft message board that I'm not sure is for the team or the organization, but it requires a separate login. There is also some sort of wiki, but it's almost entirely in German too.

One of the more interesting pieces might have been an "RKN roadmap" post that seems to be at the top of many areas of the message boards, but I couldn't find that in English anywhere.

Even the English home page for the RNA project still has at least some German on it, and today the server status pages are coming up in German where they were in English previously.

I think your organization is a victim of self-imposed obscurity by not providing other language content, particularly English.

Just an observation.

Saenger
01-18-2010, 12:56 PM
It is a German team, a German "Verein (http://dict.leo.org/?lp=ende&search=verein)", and of course through that tradition a German HP. I don't expect the projects to have their Sites in German, French or any other main language from the beginning as well, English only if they are in some English speaking country is fine first, Spanish for Ibercivis, Dutch for Almere, Japanese for Tanpaku, whatever. It's fine to put some main content in English as well, and with RNA World there is too much English, far too little German now, but English does definitely not have to be the main language of all websites.

Concerning the off-site forum: RNA World and Yoyo@Home are not the only ones with this setup: WCG, Simap, Spinhenge, CPDN, FreeHal and Ibercivis have it, and even Folding@Home outside BOINC has the forum completely outsourced. The BOINC forum software is fine for projects that don't already have some software running or don't want to bother with real forum software. But it definitely isn't as good as a real one, and I prefer only one forum per project; I don't like the CPDN setup, with two different fora.

Michael H.W. Weber
01-18-2010, 05:01 PM
@Angus: Well, that is a good point. Indeed, our organization, board and everything else has always focused on the German language. That is because we are Germans and there are people out there who do not understand English. We initially were a message board where a lot of the DC-related, mostly English, material was translated for those who do not understand it. What you see today - I mean the English parts of the wiki, for example - is the result of just a few months of effort to translate it piece by piece - again on a voluntary basis. It will take quite some time to complete, I fear, but we are on it. ;)

By contrast, for the RNA World project we set up dual-language forums from the very beginning, and I try to put important messages right into the BOINC built-in news feeder - mainly in English.

Michael.

P.S.: Concerning the status of Rechenkraft.net e.V.: we have been an officially registered NGO for approx. 5 years, which makes donations to us tax-deductible. We focus on supporting education, research and science by using networked computers.

Angus
01-21-2010, 02:26 AM
Now yoyo is fiddling with the stats:

From a post on the RKN message board in the RNA section

I increased your rac, so you can create a profile and check the pages.
yoyo

yoyo
01-21-2010, 02:56 AM
I didn't change the stats, the credits are not touched!
He is checking the Russian translation. Therefore he wants to check whether everything is translated well, also for creating profiles. But profile creation is only possible with a RAC > 100 (to avoid spamming). Therefore I increased his RAC above it. I didn't change his credits or anyone else's. The RAC will go down automatically in the next days if he doesn't crunch.

If you quote something, then please do not quote it without the context. This falsifies it.

And do not see a conspiracy in everything.

What I can learn from you is to stop open communication, because you interpret everything as a conspiracy and you are falsifying the content as you need it.

yoyo
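
The claim that a raised RAC goes down again by itself can be illustrated with a small calculation. The Python sketch below assumes that RAC (recent average credit) decays exponentially with a half-life of one week while a host returns no work, which to my knowledge is BOINC's default; the starting value of 150 is purely hypothetical, since the thread does not say what value was set. Only the threshold of 100 comes from the post above.

```python
import math

HALF_LIFE_DAYS = 7.0  # assumption: one-week half-life for recent average credit


def rac_after(rac_now, idle_days, half_life_days=HALF_LIFE_DAYS):
    """RAC after idle_days without any new credit (pure exponential decay)."""
    return rac_now * 0.5 ** (idle_days / half_life_days)


def days_until_below(rac_now, threshold, half_life_days=HALF_LIFE_DAYS):
    """Idle days needed for the RAC to decay down to the threshold."""
    if rac_now <= threshold:
        return 0.0
    return half_life_days * math.log2(rac_now / threshold)


if __name__ == "__main__":
    start = 150.0      # hypothetical manually set RAC
    threshold = 100.0  # profile-creation limit mentioned in the post
    print(f"RAC after 3 idle days: {rac_after(start, 3):.1f}")
    print(f"Idle days until RAC <= {threshold:.0f}: {days_until_below(start, threshold):.1f}")
```

Under these assumptions a manually raised value only falls back under the limit after several days, which is also why it can show up in exported stats in the meantime.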

Angus
01-21-2010, 03:40 AM
I didn't change the stats, the credits are not touched!
He is checking the Russian translation. Therefore he wants to check whether everything is translated well, also for creating profiles. But profile creation is only possible with a RAC > 100 (to avoid spamming). Therefore I increased his RAC above it. I didn't change his credits or anyone else's. The RAC will go down automatically in the next days if he doesn't crunch.

If you quote something, then please do not quote it without the context. This falsifies it.

And do not see a conspiracy in everything.

What I can learn from you is to stop open communication, because you interpret everything as a conspiracy and you are falsifying the content as you need it.

yoyo

The invalid RAC value is exported to every external stats site.

I don't care what the reason is - one player's stats should not be increased on your whim - credit totals or RAC.

yoyo
01-21-2010, 03:51 AM
I will reset this single RAC this evening, when I'm back from the office, OK?
yoyo

gopher_yarrowzoo
01-21-2010, 05:21 AM
Yoyo - I can see the need for that: you need to check that translations work 100% and look OK. We have a dev server for this, so we can test that mods to the stats and the like are working 100%. It might be worth setting up a development server if you can, testing on there with an import of the live data so that you can increase / decrease stats to check translations, and then migrating the PHP to live once you've tested it.

Michael - I can understand that. I do know people who have trouble getting to grips with the English language, and it's their primary language.

Angus - As Yoyo said, if you're going to quote from somewhere else that the person reading may or may not visit, please at least post a link to the full story. It makes it a lot easier for people to understand.

LAURENU2
01-21-2010, 10:58 PM
Come On Guys let's have a :Hugger::Hugger:
Angus, RNA is still a baby. Please let them learn to stand and walk before you smack them down.
Remember, you are Free-DC. To me that means something.



What I can learn from you is to stop open communication, because you interpret everything as a conspiracy and you are falsifying the content as you need it.

Yoyo, please do not let Angus make you feel this way :Pokes:
You do play an important part in bringing more power to the DC world :thumbs:
And I too hope you learn to evolve and bring competition back into the DC world

I am sorry AGAIN for this :|ot|: rant, but all this bickering is starting to get to ME :mad::cry::mad:

Bigred
01-22-2010, 03:59 PM
The project isn't even open to the public yet, but it is being slammed.:stomp:


What happened to trying to help people. I guess I missed something along the way.
:brainfart:

The constant attacks are getting old and are starting to remind me of a vendetta.
:taz:

gopher_yarrowzoo
01-22-2010, 04:02 PM
Big Red, don't worry, the admin team is on top of the situation and keeping an eye on everything... You're right though.

Michael H.W. Weber
02-01-2010, 06:02 AM
The RNA World server migration is completed. :D Friday evening we started, Sunday around noon the new machine started sending out work units again. So far, we experienced no problems and with an Intel i7-920 Quad-Core, 12 GB of DDR3 RAM and a dual 1.5 TB RAID storage system the new server should not run into database issues again as was the case with the old one. We also changed everything from 32 to 64 bit which required quite some work.

Thanks a lot (!) to all donors and to Yoyo, who finally put all of this quickly into practice. ;)

We are currently running a larger set of test work units to check integrity of the system. Then we need to analyze the results in detail, and, if everything runs smoothly, we will start to allow more and more people to participate in a step-wise manner - just as a precaution to avoid server overload.

Michael.

Bok
02-01-2010, 07:47 AM
nice job. All running smoothly here..

Bok :cheers:

LAURENU2
02-01-2010, 05:36 PM
The RNA World server migration is completed. :D
if everything runs smoothly, we will start to allow more and more people to participate in a step-wise manner -
just as a precaution to avoid server overload.

Michael.

I guess that leave me :hiya: out for a bit :mad:
I have been known to void warranties :lmao:

Michael H.W. Weber
02-06-2010, 04:35 AM
Although we are still in testing mode and currently have few if any work units since we have to perform data analysis of our previous test runs, we have transiently opened RNA World registration without invitation code. I just wanted to let you guys know... ;)

Michael.

Bigred
02-06-2010, 04:12 PM
I'm in. :Pokes:

LAURENU2
02-06-2010, 08:54 PM
Although we are still in testing mode and currently have few if any work units . I just wanted to let you guys know... ;)

Michael.

2/6/2010 7:49:56 PM RNA World Message from server: (Project has no jobs available) :cry:
I put 1 node on to see how it works :evil:

Bigred
02-08-2010, 02:51 PM
2/6/2010 7:49:56 PM RNA World Message from server: (Project has no jobs available) :cry:
I put 1 node on to see how it works :evil:

I've got work!:D
:Pokes::Pokes::Pokes:

Death
02-09-2010, 08:06 AM
scores are so low. 2 points over 2 hours. dis suxx

Michael H.W. Weber
02-09-2010, 09:17 AM
scores are so low. 2 points over 2 hours. dis suxx
Ehm, using a ZX81 or what? :D No, scores are clearly in the normal range. ;)

Michael.

LAURENU2
02-09-2010, 12:07 PM
I've got work!:D
:Pokes::Pokes::Pokes:
Ya, but it only lasted a few hrs and they're out of work again :bang:
We Want Work :guntotin: We Want Work :guntotin: We Want Work :guntotin:

Bigred
02-09-2010, 03:00 PM
Ya, but it only lasted a few hrs and they're out of work again :bang:
We Want Work :guntotin: We Want Work :guntotin: We Want Work :guntotin:

Website says more coming tomorrow. We'll just have to see how long it lasts.
:lawn:

gopher_yarrowzoo
02-09-2010, 05:36 PM
Ya, but it only lasted a few hrs and they're out of work again :bang:
We Want Work :guntotin: We Want Work :guntotin: We Want Work :guntotin:

Maybe you need to erm just put like 1 node on it and at a low ratio :P

LAURENU2
02-09-2010, 09:37 PM
Maybe you need to erm just put like 1 node on it and at a low ratio :P
1 Node mmmm I don't think so I have a Fear of being run-over :train:

gopher_yarrowzoo
02-10-2010, 09:56 AM
Okay less than 50 nodes then :P

Michael H.W. Weber
02-10-2010, 10:41 AM
Well above 45,000 work units are in the process of being released today. :D

Michael.

Bigred
02-10-2010, 02:55 PM
Well above 45,000 work units are in the process of being released today. :D

Michael.

You better get a bunch more ready. Server status as of now only shows 1,100 ready to send.:(

gopher_yarrowzoo
02-10-2010, 05:23 PM
That's cause lauren2u beat ya to the punch :P

LAURENU2
02-10-2010, 06:42 PM
That's cause lauren2u beat ya to the punch :P

Yep, I spent an hr here at first light waking up all the sleeping nodes :whip:
and getting them work to do
All the little Nodes were up late getting all the :roadkill: signs painted

If they got more work I will go to full power :moto: Tonight
And / Or we can shake out any bugs in the system :guntotin:

I see what you Mean I am outputting more then any Team is :D

gopher_yarrowzoo
02-11-2010, 06:50 AM
#1 User - Lauren2u
#1 Team - Free-DC
:rotfl:

Bok
02-11-2010, 08:43 AM
I think you'll beat me to 100K....:)

LAURENU2
02-11-2010, 09:05 AM
I think you'll beat me to 100K....:)
:beep: Keep to the Right OR :train:

LAURENU2
02-11-2010, 09:36 AM
Results ready to send 115
Results in progress 13,949

I think I sucked them dry :lmao:

zombie67
02-11-2010, 10:25 AM
When a project has (say) 100,000 tasks to send out, they don't usually make all 100k available at once. They set the ready-to-send queue at (say) 1k, and then refill it as needed, as tasks are sent out. So it may look like a project is almost out of work because the queue is so small. But not necessarily so.
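
A rough sketch of that refill behaviour, in case it helps to see it with numbers. This is not the actual BOINC server code; the function name, the queue target of 1k and the low-water mark are assumptions made up for illustration.

```python
def simulate_feeder(total_tasks, queue_target, low_water, requests):
    """Toy model: a small ready-to-send queue refilled from a large backlog.

    All parameters are hypothetical; real projects configure this differently.
    Returns one (sent, ready_to_send, unreleased) tuple per client request.
    """
    unreleased = total_tasks
    ready = 0
    history = []

    def refill():
        nonlocal ready, unreleased
        moved = min(queue_target - ready, unreleased)
        ready += moved
        unreleased -= moved

    refill()  # initial fill of the visible queue
    for wanted in requests:
        sent = min(wanted, ready)
        ready -= sent
        if ready <= low_water:  # top up only once the queue runs low
            refill()
        history.append((sent, ready, unreleased))
    return history


if __name__ == "__main__":
    for row in simulate_feeder(100_000, 1_000, 100, [950, 10, 980]):
        print(row)
```

The "ready to send" number stays small the whole time even though most of the backlog has not been released yet.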

Bigred
02-11-2010, 03:25 PM
It's out of work again.:( At this rate I'll never catch up. I guess it's a good thing that they don't have a GPU app.:Pokes:

LAURENU2
02-11-2010, 06:40 PM
When a project has (say) 100,000 tasks to send out, they don't usually make all 100k available at once. They set the ready-to-send queue at (say) 1k, and then refill it as needed, as tasks are sent out. So it may look like a project is almost out of work because the queue is so small. But not necessarily so.

Like I said

I think I sucked them dry :lmao:
I'm Out Of Work To
And I have a Lot of Hungry Nodes to Feed :eat::eat::eat::eat::eat::eat:
And you all know how :firedevil:mean and :hair:grumpy Nodes get when there Hungry

If I don't get Something soon for them to crunch on, I will have to lock :cage:then all up for my own safety

Michael H.W. Weber
02-12-2010, 03:39 AM
Guys, just in short today (I have been at a conference since yesterday, where I am advertising a bit for RNA World - there is HUGE interest): New work is on the server. Indeed, it starts processing new WUs only when a certain lower threshold of ready-made WUs available for send-out has been reached.
On the weekend, I will release our full set of CMCALIBRATE work - but this time only for Linux boxes running on a 64 bit basis. Reason: The huge computational demands in RAM and run time. In parallel, I will release more CMSEARCH stuff for those that do not have these monster machines. :D
But keep in mind: we are still in testing and that is why we sometimes experience WU shortage.

Michael.

Bigred
02-12-2010, 05:22 AM
On the weekend, I will release our full set of CMCALIBRATE work - but this time only for Linux boxes running on a 64 bit basis. Reason: The huge computational demands in RAM and run time. In parallel, I will release more CMSEARCH stuff for those that do not have these monster machines. :D
But keep in mind: we are still in testing and that is why we sometimes experience WU shortage.

Michael.

Why don't you create a GPU application to make very short work of these Monsters.:cool: It could be either for Nvidia or ATI cards. If you do this and need testing, let me know. I have both types. :guntotin:

Now let me see if I can get my boxes full before Laurenu2 drains them all.:Pokes:

Michael H.W. Weber
02-12-2010, 08:54 AM
Why don't you create a GPU application to make very short work of these Monsters.:cool: It could be either for Nvidia or ATI cards. If you do this and need testing, let me know. I have both types.
It has already been done. ;) Disappointingly, GPU processing is not applicable to our computational problem (as stated in our FAQ (http://www.rechenkraft.net/wiki/index.php?title=RNA_World/FAQ/en#Does_RNA_World_support_GPU.2FCUDA.2FSTREAM_processing.3F)).

Michael.

Digital Parasite
02-12-2010, 09:47 AM
On the weekend, I will release our full set of CMCALIBRATE work - but this time only for Linux boxes running on a 64 bit basis. Reason: The huge computational demands in RAM and run time.

Wouldn't 64bit Windows systems also be able to handle the RAM and computational demands? My Win7 64bit system has no problem processing 6GB matrix multiplication running for weeks.

Jeff.

Michael H.W. Weber
02-12-2010, 10:35 AM
Wouldn't 64bit Windows systems also be able to handle the RAM and computational demands?
Yes, of course they would. But for other reasons, this time we need to specifically limit the run to Linux OS. I am thinking of sending in another series specifically for the 64 bit Windows machines later. We have to do some comparative analyses, that's why. :D
It is actually quite impressive what throughput we achieve at present with having opened registration only a few days ago.

Michael.

LAURENU2
02-12-2010, 08:14 PM
Well Michael your server did OK It took My beating :gangpunch: and kept on ticking
And Michael Please keep to the Right or I will :train: you on/about tomorrow

I Have only ported 90% of my power to your project Due to Nodes feeding on other projects
But that may change if you can provide Food for my Little hoard of Nodes :Cage:

Michael H.W. Weber
02-13-2010, 05:33 PM
And Michael Please keep to the Right or I will :train: you on/about tomorrow
Well, I hope you do not mind that I chose to keep to the left instead - just to avoid this one: :train:.
:D

Michael.

outlnder
02-13-2010, 06:29 PM
Isn't the rule of the Autobahn to "Drive Right"? :jester:

Michael H.W. Weber
02-14-2010, 07:55 AM
Isn't the rule of the Autobahn to "Drive Right"? :jester:
Indeed. That's why I chose left.
:metal:

Michael.

LAURENU2
02-14-2010, 04:10 PM
They say Left I say Right Some one is sure to be :train:
Thank goodness there are only 2 more to go :guntotin:


Have only ported 90% of my power to your project Due to Nodes feeding on other projects
But that may change if you can provide Food for my Little hoard of Nodes :Cage:
Up to about 95% NOW and I see some nodes searching for food
We need more Food Michael :Pokes:

I see you dishing out some fast :eat: food there 1 to 10 min WU's
How is your Server doing under the high load
Is that why some of my Nodes are looking for WU's

Michael H.W. Weber
02-14-2010, 05:19 PM
Well, as stated on the project main page in the NEWS section we were fiddling around a bit with the database. It seems, however, that to our surprise there is rather a bottle neck issue with the dual HD RAID system. We are on it but for the non-Linux x64 guys it might be advisable to add a second project just in case the well runs dry overnight (although there is CMS work queued up already). :rolleyes:

Michael.

outlnder
02-14-2010, 09:41 PM
Mr. Weber, even though I am currently not doing your project, I am very pleased that you are communicative and willing to do whatever you have to do to keep it up and running.

I assure you that your attitude is few and far between.

Thank you for a job well done.

Bigred
02-15-2010, 06:50 AM
I agree. It is almost unheard of for a project to be informative any place except the project web site. Keep up the great work.:thumbs:
And make sure you enjoy your Rosenmontag.:jester:

Michael H.W. Weber
02-15-2010, 07:00 AM
Thanks guys for the nice encouragement. :thumbs: Even on Rosenmontag (how do you know?) we keep an eagle eye on our project as you can see. :D

Michael.

Bigred
02-15-2010, 10:27 AM
(how do you know?)
Michael.

Take a look at my location. I'm at work in Mainz-Kastel and I can hear the parade across the river in Mainz.:drink:

Michael H.W. Weber
02-15-2010, 11:14 AM
Take a look at my location. I'm at work in Mainz-Kastel and I can hear the parade across the river in Mainz.:drink:
Ahh, that is indeed an explanation. :D

Back to RNA World now: As soon as the current CMS WUs are over, we most likely need to wait until completion of the current CMC WUs before I can put more CMS WUs online. I will put this information as well on the main page notice board, soon.

Michael.

Angus
02-15-2010, 01:54 PM
It looks like there are about 7200 CMC WUs queued up (un-sent).

Since those can only be crunched on Linux 64 bit machines, and they each take 1.5 hours or so (from your server status graphs), it looks like it might be a couple of weeks before those all get crunched, since there don't seem to be very many of the required machines.
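
Back-of-envelope version of that estimate. The 7200 WUs and 1.5 hours each are the figures above; the number of Linux x64 cores working on them is unknown, so the 32 used here is purely an assumption.

```python
def days_to_finish(workunits, hours_per_wu, cores):
    """Days needed if `cores` cores crunch the backlog around the clock."""
    core_hours = workunits * hours_per_wu
    return core_hours / cores / 24


if __name__ == "__main__":
    # 7200 WUs * 1.5 h = 10800 core-hours in total
    print(days_to_finish(7200, 1.5, cores=32))  # cores=32 is a guess
```

With about 32 cores that comes out at roughly two weeks; fewer cores means proportionally longer.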

Is that correct?

Michael H.W. Weber
02-15-2010, 04:19 PM
Yeah. So the best way to solve the issue would be to hook up more Linux x64 machines, right? :D
I could of course produce more CMS WUs by running CMC locally as I did before on my 955 BE. We will see how things develop.
But we need people to understand, that in the present development stage of the project, it is of importance to make sure our system really runs well. We are still in testing phase. Please never forget that. You saw how only recently our quite powerful new server was hit by the tremendous requests. Or the issue with the RAID system. Good things sometimes take time to optimize for the many unexpected small issues that surely arise with such a project and for that we just request a little patience.

Michael.

[edit]: By the way, the server currently helps crunching away these CMC WUs for it is a Linux x64 machine, too. :D

gopher_yarrowzoo
02-15-2010, 06:53 PM
Michael - just a thought: what FS are you using on the RAID, Ext2 or 3? If it's 3 you WILL see bottlenecking. We had that here and wondered why; we almost rebuilt the server until we saw it was EXT3, switched to EXT2 (no journalling) and now it runs super fast

Michael H.W. Weber
02-16-2010, 05:51 AM
Well, using the CMS WU shortage period for server analysis and tweaking, within a day we have already modified a number of things and we think it works well now. For example, we run the WU generation on a RAM disk, avoiding HD load, which dramatically speeds things up and resolves the HD bottleneck issue. Concerning the file system, I actually assumed it must be ext2 but I will make sure by talking to our sysadmin. Anyway, thanks for the hint. :thumbs:

Michael.

LAURENU2
02-17-2010, 02:45 AM
So how long do you think we (windows users) will be out of work :cry:
Just remember anything linux can do we can do better and more of it :poke:

Michael H.W. Weber
02-17-2010, 04:40 AM
It shouldn't take much longer. Approx. two thirds of the CMC WUs are already completed. Thereafter, a Windows session will start. ;)

Michael.

Bok
02-17-2010, 09:27 AM
Hi Michael,

I have some cmc units on a linux 64bit which are at 90hrs (82%) and 50hrs (43%) is this ok? Most others on the same machine have ran in a few hours at most..

Bok

Michael H.W. Weber
02-17-2010, 10:59 AM
Hi Michael,

I have some cmc units on a linux 64bit which are at 90hrs (82%) and 50hrs (43%) is this ok? Most others on the same machine have ran in a few hours at most..
Yes, that is completely normal. In fact, these we call "in the monster WU range". :D I have one now at 90 hrs with 35% done on my AMD 955 BE Quad-Core. That one is the second largest we have in the pipeline. But often, especially these long WUs suddenly finish, so their total run time indication more frequently is overestimated compared to the small CMC WUs. We also do not fully understand why this happens. So we just have to accept it as a fact at present.

Michael.

Michael H.W. Weber
02-18-2010, 03:49 AM
Michael - just a thought: what FS are you using on the RAID, Ext2 or 3? If it's 3 you WILL see bottlenecking. We had that here and wondered why; we almost rebuilt the server until we saw it was EXT3, switched to EXT2 (no journalling) and now it runs super fast
Could you please specify a bit more in detail what exactly the problems were with EXT3? We figured that our server has EXT3 as well (to my surprise).

Michael.

Michael H.W. Weber
02-18-2010, 01:31 PM
Please do not be surprised if the RNA World server is unreachable for a couple of hours. We have to run a thorough HD hardware check to exclude damage with the file system (don't worry - everything was backed up in advance and is backed up anyway regularly).
I will also release additional CMS WUs soon (this week). But details will be found on the main page as usual in the NEWS section.

Michael.

gopher_yarrowzoo
02-18-2010, 03:17 PM
It's to do with the journal: any FS transaction, i.e. a file read/write, gets added to it.
Bok will verify this one: we set up a super fast new server and after very little time it slowed from 6Mbit transfer rates to sub-1.5Mbit. We pulled it apart and redid it, still the same; redid the chipset drivers, the works - see if you can't turn the journalling off in EXT3..

Michael H.W. Weber
02-19-2010, 03:48 AM
New CMS work ahead some time this weekend. Concerning the EXT3 journalling issue thanks for the information. We will look into it. :thumbs:

Michael.

LAURENU2
02-19-2010, 09:12 AM
New CMS work ahead some time this weekend.. :thumbs:
Michael.

But but :hiya: I want it NOW .:eat: :eat: :eat:

Michael H.W. Weber
02-20-2010, 01:29 AM
I will release around 300.000 CMS WUs, just for a start. :D

Please stick to preferentially completing the Linux x64 CMC WUs by deactivating CMS if you have Linux x64 machines that still receive CMC WUs (some don't due to our very strict HR settings, so these can work on the upcoming CMS instead if you like); otherwise we will have a WU shortage again later on. The shorter deadline of CMS will make BOINC preferentially work on the new CMS WUs. That's why I ask you to temporarily deactivate CMS on the relevant boxes.

Michael.

LAURENU2
02-20-2010, 02:50 AM
I will release around 300.000 CMS WUs, just for a start. :D
Michael.
That will be a good for a breakfast :eat:
But what about Dinner :bar:

Bigred
02-20-2010, 04:10 PM
That will be a good for a breakfast :eat:
But what about Dinner :bar:

There might be a few left for an afternoon snack.:fridge:

LAURENU2
02-20-2010, 09:23 PM
There might be a few left for an afternoon snack.:fridge:
NOT IF I CAN HELP IT
First come first served My train is fueling up
:bigtrain::bigtrain::bigtrain:

LAURENU2
02-21-2010, 03:03 PM
I think I was to greedy I melted one of My DSL Modems with all the short WU's :mad:
Only have 1 DSL line working now so My totals will be low today:(
Off to FRY's I go to get a New DSL Modem :bang:

yoyo
02-21-2010, 03:52 PM
I think I was to greedy I melted one of My DSL Modems with all the short WU's :mad:
Only have 1 DSL line working now so My totals will be low today:(
Off to FRY's I go to get a New DSL Modem :bang:

So we are not only testing our server ;)
It was also a test for your modem :guntotin:

Angus
02-21-2010, 05:48 PM
:hair:I'm getting consistent "no work available" messages again even though the server status page insists there are about 17000 WUs available. :firedevil:

They can't *ALL* be non-Windows tasks, can they?

LAURENU2
02-21-2010, 08:24 PM
So we are not only testing our server ;)
It was also a test for your modem :guntotin:
Yes I guess so Life is one big test
I'm back on line now and see
2/21/2010 4:18:41 PM RNA World Message from server: cmcalibrate is not available for your type of computer.
2/21/2010 4:18:41 PM RNA World Message from server: (there was work but it was committed to other platforms)
On some of my Nodes :cage:

gopher_yarrowzoo
02-22-2010, 07:00 AM
I think I was to greedy I melted one of My DSL Modems with all the short WU's :mad:
Only have 1 DSL line working now so My totals will be low today:(
Off to FRY's I go to get a New DSL Modem :bang:

You sure it wasn't maybe a line spike ....
Oh and maybe need to invest in a cooling tray for your routers :P

LAURENU2
02-22-2010, 10:38 AM
You sure it wasn't maybe a line spike ....
Oh and maybe need to invest in a cooling tray for your routers :P
No line spike here home is filtered 3 times and nothing else blew out
All 4 of the gigabit routers are cool but DSL Modems does get warm :firedevil:under load

I push things to limit here at home 24/7 things burn out all the time.:hair:
I have even been known to take down projects I work on:blush:

gopher_yarrowzoo
02-22-2010, 06:32 PM
No line spike here home is filtered 3 times and nothing else blew out
All 4 of the gigabit routers are cool but DSL Modems does get warm :firedevil:under load

I push things to limit here at home 24/7 things burn out all the time.:hair:
I have even been known to take down projects I work on:blush:

The routers will be cool as they have fans in them - bet the DSL's don't I know mine don't...
You do take project out sometimes either by server overload, bandwidth saturation or you just plain suck it dry.

Michael H.W. Weber
02-23-2010, 12:39 AM
Well, I think we will need to install a filter on the server side that, based on the run time estimates, prevents such small WUs from being sent out to clients in the future. Although we have more than 200,000 WUs left, little is computed due to the extremely heavy MySQL load that is caused by the massive connections. So, another lesson learned, I would say. These small WUs in fact can easily run on some of the (mostly idle) server cores and will also finish within a few days. :D
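
To illustrate the idea of such a filter (a sketch only, not our actual server code): work units are split by their estimated run time, and anything below a cut-off stays on the server cores. The `WorkUnit` class, the `split_by_runtime` name and the 60 second threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    name: str
    est_runtime_s: float  # run time estimate in seconds (hypothetical field)


def split_by_runtime(workunits, min_runtime_s):
    """Return (send_to_clients, keep_on_server) based on the run time estimate."""
    send, keep = [], []
    for wu in workunits:
        if wu.est_runtime_s >= min_runtime_s:
            send.append(wu)
        else:
            keep.append(wu)
    return send, keep


if __name__ == "__main__":
    batch = [WorkUnit("a", 5), WorkUnit("b", 900), WorkUnit("c", 60)]
    send, keep = split_by_runtime(batch, min_runtime_s=60)
    print([wu.name for wu in send], [wu.name for wu in keep])
```

Since the run time estimates are themselves unreliable, some short WUs would still slip through such a filter.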

Michael.

LAURENU2
02-23-2010, 09:55 AM
Might last a few days if we could get them out of your server Michael
All I seem to get is like 1 WU maybe every 10 Min And thats for a quad :mad:
Why make the nods Wait 10 min to return for a WU that runs for a Min or less
Other projects alow WU's to be sent when there done and not wait the 10 min
like RNA does:Pokes:

zombie67
02-23-2010, 10:14 AM
Might last a few days if we could get them out of your server Michael
All I seem to get is like 1 WU maybe every 10 Min And thats for a quad :mad:
Why make the nods Wait 10 min to return for a WU that runs for a Min or less
Other projects alow WU's to be sent when there done and not wait the 10 min
like RNA does:Pokes:

The server is already overwhelmed. Allowing a shorter back-off time would only make that worse.

But by not sending out the very short tasks at all, that should really improve the load on the server. And that should allow for shorter back-off times.
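
Rough numbers on why short tasks plus a back-off hurt both sides. This is a simplified model with assumed inputs (1 task per request, a 1 minute task, a 10 minute back-off, a quad core), taken from the complaint above rather than from any server setting I can see.

```python
def core_utilization(wus_per_request, wu_runtime_min, backoff_min, cores):
    """Fraction of the host's core time kept busy between two requests."""
    busy = wus_per_request * wu_runtime_min
    capacity = cores * backoff_min
    return min(1.0, busy / capacity)


def requests_per_hour(hosts, backoff_min):
    """Scheduler requests per hour if every host asks once per back-off."""
    return hosts * 60 / backoff_min


if __name__ == "__main__":
    print(core_utilization(1, 1, 10, 4))   # quad core, 1-minute task
    print(requests_per_hour(1000, 10))     # 1000 hosts is a made-up figure
```

Longer tasks raise the first number without raising the second, which is why filtering out the very short tasks helps more than shortening the back-off.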

yoyo
02-23-2010, 10:18 AM
Correct! I increased the time between requests to 20 minutes :)

LAURENU2
02-23-2010, 10:25 AM
Correct! I increased the time between requests to 20 minutes :)

I guess I will pull off this project then :hiya: 20 min is a True waste of my cycles
That should help with your server load

Michael H.W. Weber
02-23-2010, 11:25 AM
Please just add another project in parallel. That will allow us to complete this set of WUs (which in fact is an extremely cool project, just read our scientific objectives (http://www.rechenkraft.net/wiki/index.php?title=RNA_World/Scientific_objectives/en#Project:_CRISPR) section on this series) and your machines won't go idle as well. In the meantime, the important Linux x64 CMC set will be completed (it is nearly done now) and thereafter we will have plenty of long WUs plus I will add the first analyses of the human genome which are currently in preparation. :D

Michael.

LAURENU2
02-23-2010, 05:05 PM
Please just add another project in parallel.
Michael.

RNA is one of 5 projects that I am running now at the same time
:Pokes:Look below at my sig

But I have noticed that RNA on some Nodes Runs like a weak puppy:fozzie:
and lets other projects eat up most of the power:fireboun:

I was Not going to Detach from RNA But rather pull back on the amount of the power ported to RNA
I got the feeling by the response time from your Server that it was being overwhelmed
Turning down my power would Fix it:D I can pump out a lot of bytes from my Garage :blush:

Angus
02-23-2010, 10:20 PM
The length of the WUs is ridiculous, and to increase the time between requests is an insult.

I watched 8 or 10 WU start to download last night. The first 6 were done and uploaded before the others even finished downloading.

There's something basically wrong with their design if they can't produce a WU that lasts for more than 30 seconds.

Michael - I looked at your reference in the wiki about the current project -

Project: CRISPR

* The CRISPR elements are part of a prokaryotic defence system directed against external attacks by e.g. viruses and may be viewed as a simple immune system of microorganisms.

By employing RNA World to systematically screen organisms for the presence of the various types of this defence machinery, we hope to acquire important information on the global distribution and varieties of this system. There is an enormous repertoire of potential applications to the results of such analyses ranging from the improvement of industrially relevant microbial food production to novel ways of coping with multi-drug resistant pathogenic bacteria.

How is this supposed to explain why the WUs have to be so short???? Can you explain this in humanly understandable English?

zombie67
02-24-2010, 10:42 AM
They already explained it, and a fix is in the works. Did you not read the post where they said that they are going to apply a filter to avoid sending out the very short tasks?

Michael H.W. Weber
02-24-2010, 11:48 AM
How is this supposed to explain why the WUs have to be so short???? Can you explain this in humanly understandable English?
Yes, it simply is a result of the low complexity of this WU series. Moreover, with these analyses it is pretty hard to say which WU will take a lot of time and which not unless you run a test simulation on each of these and even then the true results will vary from expectation.

Michael.

Angus
02-24-2010, 03:40 PM
They already explained it, and a fix is in the works. Did you not read the post where they said that they are going to apply a filter to avoid sending out the very short tasks?

Not trying to be pissy here :

I don't see that they explained WHY the WUs were so short, nor is there any resolution in place at this time.

I see Michael has sort of explained it now as "low complexity".

I find it interesting that they didn't know up front that this batch of WUs would be so short. I would think that they would run some sort of basic sanity check on the WUs they were generating to see if they even worked before releasing them to the wild, and that would have revealed the short execution time problem.

gopher_yarrowzoo
02-24-2010, 06:38 PM
Angus "Not trying to get pissy here...." Oh really I think you are again.
Now this project is still in TEST phase and thus short / long WU's will occur they need to test server load and balancing..
What has WU Length got to do with ANYTHING anyway apart from wasting time waiting on the next one...
Like Micheal has suggested paralleling it with other stuff, so maybe you should do just that and relax a little!

LAURENU2
02-24-2010, 06:55 PM
You got to remember This is still a Baby Project :eat:
It is just starting to :walking: WU-lk on it's own:lmao:
You have to expect a stumble or two before it can run fast and smooth

Angus
02-24-2010, 07:24 PM
Angus "Not trying to get pissy here...." Oh really I think you are again.
Now this project is still in TEST phase and thus short / long WU's will occur they need to test server load and balancing..
What has WU Length got to do with ANYTHING anyway apart from wasting time waiting on the next one...
Like Micheal has suggested paralleling it with other stuff, so maybe you should do just that and relax a little!

I was trying to make a response to zombie67's snarky comment about not reading another post that was unrelated to my question about WHY the WUs were so short.

As for being a "test" or "alpha" or "beta" project, that would be most BOINC projects.
Only 8 have identified themselves as "Production" projects. Project status (http://boincstats.com/page/project_status.php)


Also, both Michael and Yoyo have been around BOINC long enough to understand that releasing 300,000 very tiny WUs will swamp their server. Anyone who has tested with 'uppercase' exposed to the public has seen that behavior.

So, indeed, WU length has everything to do with how busy the server and database will be.

zombie67
02-24-2010, 08:37 PM
I was trying to make a response to zombie67's snarky comment about not reading another post that was unrelated to my question about WHY the WUs were so short.

My post had nothing to do with your "why" question. It was purely a response to your bitching about the short WUs. You acted surprised and "insulted" that you were getting short WUs. If you had read the previous posts before posting, you would have known that they were aware already of the problem and working on implementing a solution.

gopher_yarrowzoo
02-25-2010, 06:21 AM
My post had nothing to do with your "why" question. It was purely a response to your bitching about the short WUs. You acted surprised and "insulted" that you were getting short WUs. If you had read the previous posts before posting, you would have known that they were aware already of the problem and working on implementing a solution.

Zombie, IMHO sometimes is not worth trying to explain stuff to people if you don't give them the answer they expect see it all the time in my line of work...

Reminds me I really should pop over to your forum and say hi again :rotfl:

Michael H.W. Weber
02-25-2010, 08:43 AM
So, indeed, WU length has everything to do with how busy the server and database will be.
Indeed. But it is a big difference between just knowing this and measuring the exact impact on your own machine in practice. And that's just what we did.

Michael.

LAURENU2
02-26-2010, 02:29 AM
Well I hope you got the exact impact on your own machine in practice :Pokes:I know I did :hair::hair::hair:
I never did see so many blinking lights before except for Jeff's X Mass display :lmao:
By the way he is better at it then you Jeff's lights (http://www.free-dc.org/forum/showpost.php?p=117223&postcount=1):lmao:

Bigred
02-26-2010, 05:20 AM
I'm glad somebody is getting some work because I sure ain't.:(

LAURENU2
02-26-2010, 09:02 AM
I'm glad somebody is getting some work because I sure ain't.:(
Like I said

But I have noticed that RNA on some Nodes Runs like a weak puppy
and lets other projects eat up most of the power

When run with other projects you sometimes have to make the client get work by kicking the Update button
RNA seems to want to go to sleep if there is no work sent for a bit

Michael H.W. Weber
02-27-2010, 04:40 AM
RNA seems to want to go to sleep if there is no work sent for a bit
The project should actually check every 20 minutes for work.

Maybe you have noticed that the average CMS run times are now slightly increasing. My 955 BE no longer spends only a few seconds on the WUs but up to 15 minutes. This should persist for the remaining WUs that are still in the pipeline.
Concerning the two Linux x64 WU packets, around 35 and 63 very long WUs remain to be completed. After that we will have plenty of work to do at all ends. ;)

Michael.

LAURENU2
02-27-2010, 01:06 PM
Your project is still overwhelming your server and my network
I never did see so many blinking lights before you might think it was Las Vegas:lmao:

Michael H.W. Weber
02-28-2010, 07:51 AM
I never did see so many blinking lights before you might think it was Las Vegas:lmao:
:D The current set of CMS WUs should be complete, soon. Still, some Linux x64 CMCs remain in the pipeline. And these are monsters... So, beware... :D

Michael.

Bigred
03-01-2010, 07:45 AM
:idea:Maybe someone should turn the cms_validator back on. There are only about 42,000 workunits waiting on it.:Pokes:

Bok
03-01-2010, 01:31 PM
Hi Michael,

I still have two monster wu's running on a 64bit linux box. They are both at 100hrs now and ~ 90% complete, but their deadlines were over this past weekend. I'll let them finish, just curious as to whether credit will still be applied?

Bok :cheers:

Angus
03-01-2010, 05:42 PM
:idea:Maybe someone should turn the cms_validator back on. There are only about 42,000 workunits waiting on it.:Pokes:

Since the validator isn't running, and credits are pending, does this impact the WU being marked as "completed" ? And, if the WU isn't completed, are the crunchers processing all 3 of the initially issued results, instead of meeting the quorum with two valid returned results, causing a 50% increase in server load to download and upload and process those extra results?

Edit: Never Mind. It appears that they have changed the initial replication setting to only 2. Good choice.
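
The 50% figure is just the ratio of surplus results to the quorum. A quick sketch; the function name is made up, and the 300,000 WU count is the batch size Michael mentioned earlier.

```python
def extra_results_fraction(initial_replication, quorum):
    """Surplus results per WU, as a fraction of the results actually needed."""
    return (initial_replication - quorum) / quorum


if __name__ == "__main__":
    wus = 300_000
    for replication in (3, 2):
        extra = extra_results_fraction(replication, quorum=2)
        print(replication, extra, wus * replication)
```

With replication 3 and a quorum of 2 the surplus is 0.5, i.e. 900,000 results instead of 600,000 for the batch; with replication 2 it drops to zero.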

Michael H.W. Weber
03-01-2010, 06:32 PM
@Angus: The current settings are efficient in avoiding the MySQL attack, so server issues are not seen anymore. We now just wait to complete the CMS WU set and, as said before, after that we will bring all settings back to normal on the server side and this type of problem should never occur again as we have found an efficient measure to avoid this accumulation of short WUs.
Concerning the WU redundancy, yes, with this set we have more results to validate (and to credit of course) than usual, because as long as the validator is halted, no "WU already complete" signal will be sent to the clients to abort non-started computation efforts. This is clearly not so nice because it causes overhead in computational efforts, but the WUs are very tiny, so we think it should not bother us too much. Just consider other projects that do not even make use of this possibility to reduce "WU redundancy".

@All: We seem to have identified a yet unknown problem with some of the long running CMC WUs. First, it seems they run about double the time which is indicated initially by the progress bar. This is unavoidable due to the type of computations we carry out and the unreliability of the progress bar in case of CMC WUs increases with their complexity (i.e. duration). So on the basis of the progress indicator, just do not make the decision to abort a WU. If possible, keep it running, please. The real baffling thing is that some of the long WUs are judged as "client error". Strangely, they seem to be intact (at present an assumption). We are currently checking that more in detail and we need more data to identify the source of the problem. So, please help us a bit in finding out what goes wrong by not giving up to run these WUs such that we can analyze the outcome one by one.

Michael.

P.S.: I am surprised that we encounter such an issue at this stage since I was quite convinced things work fine. But, of course, we focused on the smaller CMC WUS to test a lot in a short period of time. Now we released the "big birds" after all that testing and here we need to do some homework again. I have my 955 BE on these long WUs, so at least we are sitting in the same boat and that boat I like. :D

LAURENU2
03-01-2010, 08:40 PM
The current settings are efficient in avoiding the MySQL attack, so server issues are not seen anymore. :D
Now that I have Pulled back to 40% I do see your server a bit more responsive
What are your plans to handle an increase in power input to RNA:Pokes:

Angus
03-01-2010, 09:29 PM
Michael

Can you explain what is being discussed at the bottom of this thread?

http://www.rechenkraft.net/phpBB/viewtopic.php?f=76&t=10706

It's all in German, and the translation services don't speak well enough German to any sense of it make.

Saenger
03-02-2010, 12:55 AM
Michael

Can you explain what is being discussed at the bottom of this thread?

http://www.rechenkraft.net/phpBB/viewtopic.php?f=76&t=10706

It's all in German, and the translation services don't speak well enough German to any sense of it make.

It's about one of my WUs: the first run was lost after 125h when my computer crashed, and the second run finished after 144h but was declared erroneous.
So far 3 others have finished, all with the same error as well, and all seemingly with the same output file uploaded. If it wouldn't be for the computational error, they would probably be declared valid.

Ananas and Michael are looking into that issue, what had happened, how to avoid it in the future and probably what to do in this concrete case.

As I'm no programmer nor biology scientist, "just" a mechanical engineer, I can't really say more useful stuff ;)

Michael H.W. Weber
03-02-2010, 04:08 AM
We are on the track of the issue. I figured that for a yet unknown reason the client does not write the result data into the output file. Instead, it uploads the input file as result, which is logical since the input file is overwritten with the same file name as soon as processing is complete. What puzzles us is the fact that this problem is occurring only with a subset of WUs. Since these run perfectly on my local machine, I suspect a problem with the wrapper. But we do not know for sure at present. The logs tell us that within the last step there is a break in processing. If you have ideas, just post them here. It might help us! :thumbs:

Michael.

Bok
03-03-2010, 08:35 AM
One of my monster jobs finished and is marked as 'too late to validate'...

http://www.rnaworld.de/rnaworld/results.php?hostid=374&offset=0&show_names=0&state=4

140 hours wasted? The other one is at 142hrs right now and 93% complete. I'm not too bothered about points, just hate to see any result go unused..

Bok

LAURENU2
03-03-2010, 09:21 AM
One of my monster jobs finished and is marked as 'too late to validate'...

http://www.rnaworld.de/rnaworld/results.php?hostid=374&offset=0&show_names=0&state=4

140 hours wasted? The other one is at 142hrs right now and 93% complete. I'm not too bothered about points, just hate to see any result go unused..

Bok

But The Claimed credit of 1,150.47 points sure would help in your Quest of World Domination:lmao:

Michael H.W. Weber
03-03-2010, 01:47 PM
One of my monster jobs finished and is marked as 'too late to validate'...

http://www.rnaworld.de/rnaworld/results.php?hostid=374&offset=0&show_names=0&state=4

140 hours wasted? The other one is at 142hrs right now and 93% complete. I'm not too bothered about points, just hate to see any result go unused..

Bok
I will check whether Yoyo can manually re-enter this WU for validation to see whether it was computed OK or still shows an error. For now, if your WU has passed 95%, please run the following command on your Linux box:


strace -p <pid of cmcalibrate> 2>&1 | tail -c 1000000000 > strace.out

This will create a trace file with a maximum size of 1 GByte (strace writes to stderr, hence the 2>&1, and tail -c counts bytes). This might be useful to find the error.

Michael.

Michael H.W. Weber
03-04-2010, 05:34 AM
And please also try this one and post the output here:


ulimit -a
Thanks! :D

Michael.

P.S.: Except for 4 WUs the CMCs are now completed. We tweaked a bit again on our HR settings and by doing so improved the project's throughput.

Bok
03-04-2010, 08:24 AM
[root@dual275 ~]# strace -p 3551 | tail -10000000 > strace.out
Process 3551 attached - interrupt to quit
times({tms_utime=52889500, tms_stime=2708, tms_cutime=0, tms_cstime=0}) = 542291108
lseek(7, 0, SEEK_SET) = 0
write(7, "0.939167", 8) = 8
times({tms_utime=53009664, tms_stime=2711, tms_cutime=0, tms_cstime=0}) = 542411305
lseek(7, 0, SEEK_SET) = 0
write(7, "0.940000", 8) = 8
times({tms_utime=53130244, tms_stime=2715, tms_cutime=0, tms_cstime=0}) = 542531933
lseek(7, 0, SEEK_SET) = 0
write(7, "0.940833", 8) = 8
times({tms_utime=53250881, tms_stime=2719, tms_cutime=0, tms_cstime=0}) = 542652595
lseek(7, 0, SEEK_SET) = 0
write(7, "0.941667", 8) = 8
times({tms_utime=53371605, tms_stime=2724, tms_cutime=0, tms_cstime=0}) = 542773351
lseek(7, 0, SEEK_SET) = 0
write(7, "0.942500", 8) = 8
times({tms_utime=53491730, tms_stime=2726, tms_cutime=0, tms_cstime=0}) = 542893510
lseek(7, 0, SEEK_SET) = 0
write(7, "0.943333", 8) = 8
times({tms_utime=53612159, tms_stime=2729, tms_cutime=0, tms_cstime=0}) = 543013963
lseek(7, 0, SEEK_SET) = 0
write(7, "0.944167", 8) = 8
times({tms_utime=53732545, tms_stime=2732, tms_cutime=0, tms_cstime=0}) = 543134373
lseek(7, 0, SEEK_SET) = 0
write(7, "0.945000", 8) = 8
times({tms_utime=53852178, tms_stime=2735, tms_cutime=0, tms_cstime=0}) = 543254039
lseek(7, 0, SEEK_SET) = 0
write(7, "0.945833", 8) = 8
times({tms_utime=53972575, tms_stime=2738, tms_cutime=0, tms_cstime=0}) = 543374466
lseek(7, 0, SEEK_SET) = 0
write(7, "0.946667", 8) = 8
etc etc

[root@dual275 1]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
max nice (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 16383
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
max rt priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 16383
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited


This one will finish today at some point, though again the deadline was the 27th. It's at 165hrs right now.

yoyo
03-04-2010, 10:17 AM
Hi Bok,
the strace matters most at the end of the WU, in case it fails.
yoyo

Bok
03-04-2010, 10:59 AM
no problem, strace is still running :)

wu is at 99.352% after 168hours.

LAURENU2
03-04-2010, 02:33 PM
Michael

Results ready to send 1 :Pokes: we need more work

Results in progress 1,585 :thumbs:

Workunits waiting for validation 13,296:idea: are these all mine :D

Angus
03-04-2010, 03:40 PM
I see over 40,000 results (tasks?) still waiting for validation, but that's about half of what it was yesterday, so the validator is crunching its way through the pile of work you gave it.

yoyo
03-04-2010, 03:53 PM
no problem, strace is still running :)

wu is at 99.352% after 168hours.

I think this one finished and validated.
yoyo

Bok
03-04-2010, 06:07 PM
yup. Just noticed myself. Here is the end of the strace if you are interested.


times({tms_utime=61225425, tms_stime=3868, tms_cutime=0, tms_cstime=0}) = 550682189
lseek(7, 0, SEEK_SET) = 0
write(7, "0.999167", 8) = 8
time(NULL) = 1267728664
times({tms_utime=61339827, tms_stime=3902, tms_cutime=0, tms_cstime=0}) = 550797050
time(NULL) = 1267728664
times({tms_utime=61339827, tms_stime=3902, tms_cutime=0, tms_cstime=0}) = 550797050
time(NULL) = 1267728680
times({tms_utime=61341235, tms_stime=3912, tms_cutime=0, tms_cstime=0}) = 550798650
write(1, " 48:43:39\n filter - lo"..., 67) = 67
time(NULL) = 1267728680
times({tms_utime=61341235, tms_stime=3912, tms_cutime=0, tms_cstime=0}) = 550798652

mremap(0x2aaaaf386000, 292360192, 511516672, MREMAP_MAYMOVE) = 0x2aaaaf386000
time(NULL) = 1267736446
times({tms_utime=62065739, tms_stime=4064, tms_cutime=0, tms_cstime=0}) = 551575236
write(1, " 02:09:26\n# -------- --- --"..., 1399) = 1399
munmap(0x2aaaaf386000, 511516672) = 0
write(1, "//\n", 3) = 3
read(8, "", 4096) = 0
time(NULL) = 1267736446
times({tms_utime=62065739, tms_stime=4071, tms_cutime=0, tms_cstime=0}) = 551575245
lseek(8, 0, SEEK_SET) = 0
open("cmfile.xxx", O_RDONLY) = -1 ENOENT (No such file or directory)
open("cmfile.xxx", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 9
read(8, "INFERNAL-1 [1.0.2]\nNAME Intr"..., 4096) = 4096
read(8, ".497 -5.274 -4.934 0.000 0.0"..., 4096) = 4096
read(8, "10 -1.005 -6.446 -3.975 0.66"..., 4096) = 4096
read(8, " 104 103 3 106 3 -10.1"..., 4096) = 4096
read(8, "MP 141 140 1 145 6 -10"..., 4096) = 4096
read(8, " D 175 173 3 176 5"..., 4096) = 4096
read(8, " \n IR 212 212"..., 4096) = 4096
read(8, "18 -0.006 -9.594 -9.874 -10.2"..., 4096) = 4096
read(8, ".164 0."..., 4096) = 4096
read(8, " 4 -9.716 -9.923 -0.008 -8.3"..., 4096) = 4096
read(8, " 352 6 355 3 -10.745 -0.0"..., 4096) = 4096
read(8, "MATP 103 ]\n MP 389 388 6"..., 4096) = 4096
read(8, " -3.908 0.660 -0.612 -0.293 -0"..., 4096) = 4096
read(8, " 0.000 0.000 "..., 4096) = 4096
read(8, " -5.695 -0.829 -3.908 0.660 "..., 4096) = 4096
read(8, ".906 0."..., 4096) = 4096
read(8, "]\n ML 568 567 2 570 "..., 4096) = 4096
read(8, " 0.000 0.000"..., 4096) = 4096
read(8, " 6 644 6 -6.988 -5.717 "..., 4096) = 3245
fstat(9, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaaaaad000
write(9, "INFERNAL-1 [1.0.2]\nNAME Intr"..., 4096) = 4096
write(9, " -9.399 "..., 4096) = 4096
write(9, " 1.652 -2.459 0.633 0.114 -1."..., 4096) = 4096
write(9, " 0.000 0.000 0.000 0."..., 4096) = 4096
write(9, " 6 -7.451 -7.797 -2.512 -"..., 4096) = 4096
write(9, " 151 6 -6.988 -5.717 -1."..., 4096) = 4096
write(9, ".062 0.095 -0.746 -2.041 -2.246"..., 4096) = 4096
write(9, " 0.000 0.000 0.000 0.000 \n\t\t\t"..., 4096) = 4096
write(9, "5 -0.829 -3.908 0.660 -0.612 "..., 4096) = 4096
write(9, "920 -4.087 -5.193 0.0"..., 4096) = 4096
write(9, "3 -1.564 -1.458 -1.748 "..., 4096) = 4096
write(9, "\n D 363 361 3 364 "..., 4096) = 4096
write(9, "ATL 105 ]\n ML 398 397 3 "..., 4096) = 4096
write(9, "000 0.000 0.000 \n\t\t\t\t[ MATP 1"..., 4096) = 4096
write(9, " -0.001 "..., 4096) = 4096
write(9, " 0.000 0.000 0.000 0.00"..., 4096) = 4096
write(9, "649 -6.162 -0.021 "..., 4096) = 4096
write(9, "2 -3.397 -1.054 \n D 578 "..., 4096) = 4096
write(9, " 3 -10.691 -0.856 -1.162 "..., 4096) = 4096
close(8) = 0
munmap(0x2aaaaaaac000, 4096) = 0
write(9, "408 -0.496 -5.920 -4.087 -5."..., 2190) = 2190
close(9) = 0
munmap(0x2aaaaaaad000, 4096) = 0
rt_sigprocmask(SIG_BLOCK, [INT], NULL, 8) = 0
unlink("cmfile") = 0
rename("cmfile.xxx", "cmfile") = 0
rt_sigprocmask(SIG_UNBLOCK, [INT], NULL, 8) = 0
write(1, "#\n# CPU time: 620657.39u 40.71s "..., 64) = 64
exit_group(0) = ?
Process 3551 detached
[root@dual275 ~]#
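
The tail of this trace shows cmcalibrate writing the finished model to a scratch file (cmfile.xxx) and then swapping it in under the original name with unlink followed by rename. A minimal Python sketch of the same write-then-rename pattern is given here. The function name and the transform callback are illustrative assumptions, not taken from the INFERNAL source, and os.replace is used as a one-step swap in place of the separate unlink and rename seen in the trace.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Rewrite `path` via a scratch file, then swap it in under the same name.

    `transform` is a hypothetical callback that maps the old file contents
    (bytes) to the new contents (bytes).
    """
    scratch = path + ".xxx"  # scratch name modelled on the trace; an assumption
    with open(path, "rb") as src:
        old = src.read()
    with open(scratch, "wb") as dst:
        dst.write(transform(old))
        dst.flush()
        os.fsync(dst.fileno())  # make sure the data is on disk before the swap
    # One-step swap; the trace shows unlink() followed by rename() instead.
    os.replace(scratch, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cmfile = os.path.join(tmp, "cmfile")
        with open(cmfile, "wb") as f:
            f.write(b"INFERNAL-1 [1.0.2]\n")
        rewrite_in_place(cmfile, lambda data: data + b"calibrated\n")
        with open(cmfile, "rb") as f:
            print(f.read())
```

With this pattern, a run that dies before the final swap leaves the unmodified input under the original name, which would be consistent with the earlier observation that some hosts uploaded the input file as their result.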

Michael H.W. Weber
03-05-2010, 06:41 AM
Ok, guys, fasten your seat belts: as noted in our NEWS section, the last two CMS work packets alone returned about 900,000 results with a total traffic of 1 TB. :D

The upcoming weekend will bring new work, I guess. ;)

Michael.

LAURENU2
03-05-2010, 03:10 PM
Ok, guys, fasten your seat belts: as noted in our NEWS section,
Michael.
With this new work can I go back to full power without impacting your server like I did last time? :hair:
40% power is no fun :geezer:

:Pokes: Have you beefed up your DB to do better than it did?

Michael H.W. Weber
03-05-2010, 05:56 PM
Well, the upcoming CMC WUs, even those of the "monster type" should run smoothly now. Although they still do run long. :D At least we have extended the deadline for these by a whole week. ;) Concerning the upcoming CMS work, we will first send out a small batch to see how it is doing.

Michael.

LAURENU2
03-06-2010, 02:45 AM
So I guess that means hold at a 40% commit mark :(

Michael H.W. Weber
03-06-2010, 02:16 PM
New CMCs for Windows x64 have been fed into the server and will be delivered today. We expect these to run without complications unlike some WUs we delivered to Linux x64 a few weeks ago.
Note that the CMS packets coming up tomorrow will require a broadband internet connection. Transfer volumes will be one to two orders of magnitude higher for some WUs compared to what you are used to. Computation times will range from a few minutes up to 100 hrs on a 955 BE (3 GHz). The human genome will also be analyzed. :thumbs:

Michael.

vaughan
03-07-2010, 05:01 AM
Computation times ... up to 100 hrs ...

Michael.

Count me out then. I detest long running molecules.

Michael H.W. Weber
03-07-2010, 12:33 PM
Maybe a few more details. This time we have a total of 1024 genome files that will be analyzed with two RNAs. Of these 1024 files,

14 are sized > 200 MB (only 3 of these 14 > 300 MB)
126 are sized > 100 MB
319 are sized < 1 MB

All given file sizes are in uncompressed format, i.e. the files actually delivered will be smaller by at least 30%. Run times will scale approximately with genome size, but I do not have a detailed run time distribution table (an idea that came to my mind today, but which would take considerable effort, so I postponed it for a while). So, we picked out very large, intermediate and small genomes and tested these with a set of around 10 RNAs, among them of course also the two we use for today's test run. Maximum run time is below 100 hrs.

Michael.

[edit]: Some WUs seem to require well above 1 GB of RAM. Yet we have no means to determine the memory requirements prior to simply running the WU and checking things out in practice. So please report any issues, observed RAM requirements, etc. to me if possible such that we can act quickly if required (e.g. tweak the memory settings as we successfully did with the CMC WUs).

Angus
03-09-2010, 08:43 PM
I had to abort all the WUs and go NNW on this project.

The new CMS WUs use stupidly large amounts of RAM. My system used all the physical and page file space, and basically just stopped functioning. It's a pretty vanilla box, Win XP, 2GB RAM, and a dual core CPU. Probably representative of a high percentage of all the machines running BOINC.

I guess this will be a high end, 64 bit machines only project.

zombie67
03-09-2010, 10:03 PM
It would be nice if RNA had a page like this, showing requirements by app:

http://boinc.umiacs.umd.edu/apps.php

Michael H.W. Weber
03-10-2010, 08:11 AM
It would be nice if RNA had a page like this, showing requirements by app:

http://boinc.umiacs.umd.edu/apps.php
Thanks for the link, it indeed looks quite informative and worth adapting to our needs as well. :thumbs:

One more thing: We have a number of CMS WUs that unexpectedly demand really immense amounts of RAM. Could you please keep an eye on whether crashes of such WUs occur exclusively on 32-bit machines? We already have a suspicion about what might be going on here.
Also, if you could provide the RAM requirements observed for individual WUs on your systems, we may construct an improved WU assignment system such that the big ones are simply no longer delivered to machines that cannot handle them properly. As said before, we cannot yet forecast RAM consumption. So, annoyingly, we have to run tests and then adapt the project settings accordingly.
If all these measures do not help, we must consider splitting the WUs into smaller chunks, but that will result in "downstream result puzzle" efforts which I'd prefer to avoid if possible.
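
The assignment idea described here could look roughly like the following sketch. All class names, fields, the safety margin and the 32-bit ceiling are hypothetical; this is not the BOINC scheduler's API, only an illustration of filtering WUs by the peak RAM that volunteers have reported.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    name: str
    observed_peak_mb: float  # peak RAM reported by volunteers (hypothetical field)


@dataclass
class Host:
    name: str
    ram_mb: float
    is_64bit: bool


def eligible_wus(host, wus, margin=0.8, limit_32bit_mb=2048.0):
    """Return the WUs whose observed peak RAM fits on `host`.

    `margin` leaves headroom for the OS; `limit_32bit_mb` is an assumed
    per-process ceiling for 32-bit machines. Both values are illustrative.
    """
    budget = host.ram_mb * margin
    if not host.is_64bit:
        budget = min(budget, limit_32bit_mb)
    return [wu for wu in wus if wu.observed_peak_mb <= budget]


if __name__ == "__main__":
    wus = [WorkUnit("small", 300.0), WorkUnit("monster", 3800.0)]
    xp_box = Host("winxp-dualcore", ram_mb=2048.0, is_64bit=False)
    big_box = Host("linux-x64", ram_mb=8192.0, is_64bit=True)
    print([wu.name for wu in eligible_wus(xp_box, wus)])
    print([wu.name for wu in eligible_wus(big_box, wus)])
```

Since RAM consumption cannot be forecast, such a filter only helps once at least one result per WU type has been reported, which is why the observed figures from volunteers matter.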

Michael.

LAURENU2
03-10-2010, 09:18 AM
Out of all my nodes I only have 1 node that can get and do work on this project. :bang:
With that I will have to pull off this project till you can provide smaller bits of food for my nodes to crunch :Pokes:

outlnder
03-10-2010, 05:11 PM
Unfortunate for Lauren, but great for the rest of us. We will finally get some work without a server crash. :guntotin::jester::clap:

LAURENU2
03-10-2010, 10:26 PM
Unfortunate for Lauren, but great for the rest of us. We will finally get some work without a server crash. :guntotin::jester::clap:
:idea:Well see Good things come to those who wait:thumbs:
But as Arnold said I Will Be Back:lmao:

Michael H.W. Weber
03-11-2010, 05:55 AM
I think we found a solution for the RAM issue. But this might take some time to implement. :rolleyes:

Michael.

LAURENU2
03-21-2010, 08:14 PM
I think we found a solution for the RAM issue. But this might take some time to implement. :rolleyes:

Michael.
So Michael :hiya: have you/RNA made any progress yet on the solution for the RAM issue and the missing checkpoints?

zombie67
03-23-2010, 09:54 PM
Thanks for the link, it indeed looks quite informative and worth to adapt to our needs as well. :thumbs:

Any progress on this? I know it seems low priority. But the memory requirements confuse potential crunchers. Something like this would really help.

LAURENU2
03-25-2010, 09:18 PM
so michael :hiya:have you /rna made any progress on the solution for the ram issue and the no checkpoint yet ?
BUMP:Pokes:

Michael H.W. Weber
04-11-2010, 01:13 AM
Since I am still in India, just quickly two notes concerning our project from the server's NEWS page:

http://www.rnaworld.de/rnaworld/forum_thread.php?id=46
http://www.rnaworld.de/rnaworld/forum_thread.php?id=47

So, the WUs in process right now should be safe for (hopefully) any machine.

Michael.

LAURENU2
04-11-2010, 02:39 AM
Since I am still in India, just quickly two notes concerning our project from the server's NEWS page:

http://www.rnaworld.de/rnaworld/forum_thread.php?id=46
http://www.rnaworld.de/rnaworld/forum_thread.php?id=47

So, the WUs in process right now should be safe for (hopefully) any machine.

Michael.
Well since you're in India I will speak loud: NO IT IS NOT SAFE
RNA still has 1000 WUs to burn off

http://www.rechenkraft.net/phpBB/viewtopic.php?p=117573#p117573

Michael H.W. Weber
04-11-2010, 04:15 AM
Well since your in India I will speak loud NO IT IS NOT SAFE
RNA still have a 1000 WU's to burn off

http://www.rechenkraft.net/phpBB/viewtopic.php?p=117573#p117573
As I wrote in the NEWS section (see links above), there may indeed be a few WUs of the remaining batch around. Their number, however, is definitely not 1000 but in the worst case a few hundred. Moreover, these will exclusively be sent out to machines with large RAM availability. We have improved the WU delivery settings and also added a failure counter that prevents a WU from being sent out again if it has failed validation a certain number of times. :thumbs: Finally, if you take a look at the server status page you will see that even in the worst case scenario, the fraction of WUs from the old batch will be significantly below 10% of what the server currently is sending out and should be zeroed out shortly. ;)
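
A failure counter of the kind mentioned here can be sketched in a few lines. The class name and the threshold of 3 are assumptions for illustration; the thread does not state the real limit or how the server implements it.

```python
from collections import Counter


class FailureLimiter:
    """Stop re-sending a WU after it has failed validation too often.

    `max_failures` is a hypothetical setting; the real threshold is not
    given in the thread.
    """

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = Counter()

    def record_failure(self, wu_name):
        """Count one failed validation for `wu_name`."""
        self.failures[wu_name] += 1

    def may_resend(self, wu_name):
        """True while the WU is still below the failure limit."""
        return self.failures[wu_name] < self.max_failures


if __name__ == "__main__":
    limiter = FailureLimiter(max_failures=3)
    for _ in range(3):
        limiter.record_failure("cms_monster_wu")
    print(limiter.may_resend("cms_monster_wu"))
    print(limiter.may_resend("some_other_wu"))
```

The point of the cap is that a WU which fails everywhere is retired after a bounded number of attempts instead of circulating among volunteers indefinitely.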

Michael.

rilian
04-11-2010, 01:37 PM
Project is quite safe. No failing WUs on my 2 GB dual-core machine since 10 March.

LAURENU2
04-12-2010, 01:10 AM
As I wrote in the NEWS section (see links above), there may indeed be a few WUs around of the remaining batch. Their number, however, is definetely not 1000
Michael.

Yes, just after you posted, the 3817.7 MB message stopped and I got WUs :guntotin:
As soon as I complete my other project WUs I will start to suck you dry :kiss:
Do you think you (RNA) can keep up? :umm:

Michael H.W. Weber
04-12-2010, 01:20 AM
Yes just after you posted the 3817.7 MB message stopped and I got WU"S:guntotin:
:D


Do you think you(RNA) can keep up? :umm:
Well, at present a few hundred thousand WUs are waiting on the server to be processed for delivery and many more can be generated. ;)

Michael.

outlnder
04-12-2010, 01:17 PM
Ha, Lauren, it might take you a couple of days.

So much for mister fast pants. :dance::dance::dance::dance::dance::dance::dance:

LAURENU2
04-12-2010, 05:11 PM
Ha, Lauren, it might take you a couple of days.

So much for mister fast pants. :dance::dance::dance::dance::dance::dance::dance:

I'm like a big train :bigtrain::bigtrain::bigtrain:
Slow to start and hard to stop :train::train::train:

Turning On or Off a project is fast once I sign up all the Nodes
It is the New projects that take time for me

I could turn on RNA on all nodes in about 60 Min or so :moon:

zombie67
04-12-2010, 07:39 PM
If you could get BAM to work, it would take 30 seconds... :thumbs:

LAURENU2
04-12-2010, 08:31 PM
If you could get BAM to work, it would take 30 seconds... :thumbs:
But then What Would I do for the next 59.5 Min :slap:
OH Well It Was A Thought

outlnder
04-12-2010, 10:01 PM
I like to administer my pharm manually also. It's so much fun to plug and unplug the monitor, mouse and keyboard.

LAURENU2
04-13-2010, 01:47 AM
I use KVM's for the home 1 keyboard and 1 mouse and VNC to the Garage

zombie67
04-13-2010, 11:16 AM
I use VNC for everything, even the boxes in the same room.

LAURENU2
04-13-2010, 06:22 PM
I use VNC for everything, even the boxes in the same room.

Yes, But I have 2 DSL Networks And I don't understand:confused: how to bridge them. :bang:
So I need a KVM to switch from 1 PC to the other to control both of my networks :trash:
I have a 8 port KVM with 6 PC's just in my DC cave alone :lmao: (big closet):mad:

LAURENU2
04-15-2010, 01:21 AM
Ha, Lauren, it might take you a couple of days.

So much for mister fast pants. :dance::dance::dance::dance::dance::dance::dance:

Well it looks like only a couple of days will get me to 1st place now that :hiya: I am up to speed :moto: that is, if Michael's team can keep my herd happy
Was that fast enough for you outlnder :evil:

outlnder
04-15-2010, 04:11 AM
Wait a couple of hours. You may break it yet!!

LAURENU2
04-15-2010, 09:19 PM
Wait a couple of hours. You may break it yet!!
Well that's why I am only at partial commit, so as not to bog down their network
I wonder about the RNA database with all the small WUs taking up all the DB space

LAURENU2
04-16-2010, 04:45 PM
_Free-DC_
:guntotin: Takes 2 ND Place. :guntotin:

outlnder
04-17-2010, 12:56 AM
WOW, salty rain, or is that the gods crying??:allhail:

Michael H.W. Weber
04-28-2010, 03:54 AM
As described in the project's NEWS section in more detail, we had to temporarily disable the WU generator.

Michael.

Michael H.W. Weber
04-30-2010, 08:25 AM
The WU generator is working again, the small WUs will now be processed on the server, except for very, very few which we will deliver to some remote machines. Please also note that the server response delay has been drastically reduced such that connection issues should be resolved by now as well. We have plenty of WUs, will not run out of work and hope for your further support which is very much appreciated.

Michael.

P.S.: Our new concept to resolve the massive RAM requirements of our "CMSEARCH Monster WUs" has resulted in a program which is currently in testing phase. So, even here I expect good progress in the upcoming weeks...

Michael H.W. Weber
05-13-2010, 07:37 AM
Not a single error report in two weeks: it now seems that our project is really running reliably and smoothly. Given the vast number of WUs in the server queue, we are actually in need of much more support. :D So, if you could spare some cycles it would be highly appreciated...

Michael.

LAURENU2
05-13-2010, 04:20 PM
Not a single error-report in two weeks
OK I will see if I can break your new toy :bonk:
:beep: I just had my bandwidth increased on both DSL lines here :guntotin:

LAURENU2
05-13-2010, 10:49 PM
Not a single error-report in two weeks
OK I will see if I can Brake your new toy:bonk:
:beep: I Just had my Bandwidth increased on both DSL lines here :guntotin:
Michael H.W. Weber, I thought you FIXED the problem with LOOOONNNG WUs
I just aborted 6 WUs with 300 hrs of my time and the WUs still had 500 hrs to go
A WU that runs for 100 to 200 hrs is absurd, Michael
Have you even put checkpoints in yet to save all that time?

Michael H.W. Weber
05-16-2010, 11:37 AM
Michael H.W. Weber I thought you FIXED the problem with LOOOONNNG WU's
I just aborted 6 WU's with 300 HRS of my time and the WU's still had 500 hrs to go
A WU that runs for 100 to 200 HRS is absurd Michael
Have you even put check point in Yet to save all that time ?
Well, we do not have such WUs in the pipeline (unless Yoyo is doing some testing). Hence, I assume that this could just be one of these BOINC-related run time mispredictions. Could you please let me know a few more details on these cancelled "WUs" if still available?

Michael.

P.S.: Test apps are disabled, right (we are currently working on the user job submission forms)?

LAURENU2
05-17-2010, 01:55 AM
I guess you could find them here http://www.rnaworld.de/rnaworld/results.php?userid=919&offset=0&show_names=0&state=5
I really do not know how to give you the info you are asking for.
All I know is I do not want to work on long WUs
I see them, I abort them, and as long as I see them I will not port more power here

And Michael you avoided the question of
Have you even put check point in Yet to save all that time ?
Have you fixed this Problem yet ?

Michael H.W. Weber
05-17-2010, 05:22 AM
I really do not know how to give you the info you are asking for.
If you cancelled them manually, then you will find their names in the message window of the BOINC manager which you could paste in here.


All I know Is I Do not want to work on Long WU's
I see them I abort them And as long as I see them I wi9ll not port mor power here
I understand. Since my last posting some weeks ago, the WUs are mainly around 1.5 hrs in size with only few exceptions and as I believe these can easily be handled even without checkpointing. More importantly, they do not consume these tremendous amounts of RAM as some earlier batches did.


And Michael you avoided the question of
Have you fixed this Problem yet ?
So far, checkpointing is there only for x86 Linux as before. As soon as this changes, we will of course announce it.

Michael.

LAURENU2
05-17-2010, 10:22 AM
Well Michael, I just lost another 250 hrs of computer time with 8 more looonng WUs I aborted
My error page on your site has just doubled in size

http://www.rnaworld.de/rnaworld/results.php?userid=919&offset=0&show_names=0&state=5

I am sure you can find the names of the WUs on the above page

Having such long WUs floating around WITHOUT any save points is not a good thing, Michael
I am sorry to say that if I keep getting the "few exceptions", as you put it, I will flick the switch on RNA till you get it fixed

Michael H.W. Weber
05-18-2010, 02:05 PM
(1) The current run times, i.e. shortest, longest and average you can always look up at any time from our server status page [at present: 0.92 hrs average (0.03 min - 12.61 max)] and then decide whether or not to participate. However, I have seen many projects without any checkpointing and long running WUs while we at least offer checkpointing for x86 Linux. So, given the stability of our project I am currently quite happy with it but of course it always depends on how you like to use your machines. ;)

(2) As detailed before, we will implement checkpointing on the basis of a functional VM as soon as possible.

(3) I will consider implementing a run-time selector such that, if possible at all at the BOINC server software's end (that I will have to check), users can select how long the biggest WU is allowed to be. I think this might be a useful compromise at this stage. ;)

Michael.

yoyo
05-18-2010, 02:37 PM
Well Michael I just Lost another 250 Hrs of computer time with 8 more Looonng WU,s I aborted
My Error page on your sight has just doubled in size

http://www.rnaworld.de/rnaworld/results.php?userid=919&offset=0&show_names=0&state=5


Hi Laurenu2,
nobody has access to this URL except you as the logged-in user. Otherwise I would have to hack into your account, which I do not want to do. So please give us at least one ID of a LONG WU.

yoyo

LAURENU2
05-18-2010, 05:43 PM
OK here is a screen shot of one page

http://www.free-dc.org/forum/attachment.php?attachmentid=1278&stc=1&d=1274217938

Most were aborted at 50% or less, a few before they started

I'm sorry, I have a hatred of long WUs
I did one for UD that ran for about 2500 hrs 24/7 and it FAILED
I was so Pi$$ed off I promised never to do another

This is also being reported on the RNA forum
I thought it was also said you FIXED or stopped the long WUs in the past
Why are they back in the system again

Michael H.W. Weber
05-19-2010, 04:03 AM
I'm sorry I have a hatred of Long WU's
I did one for UD that ran for about 2500 HRS 24/7 and it FAILED
I was So Pi$$ed off I promised never to do another

This is also being reported on the RNA forum
Let me clarify a few things here: A long WU in RNA World does definitely NOT mean that you are at risk of not completing it without errors, unless of course you abort it manually (you shouldn't have done that, because if you hadn't you would have seen that they finish and give credits). I hope the time of error reports as we had in the past is over now. At least, the malfunctioning WUs are COMPLETELY out of the queue and, again, the RAM requirements are more than moderate. You need to note, however, that with Windows you will lose your current calculation results if you restart the system, due to the lack of checkpointing on this OS.


I thought it was also said you FIX or stooped the Long WU's in the Past
Why are they Back in the system again
As said above, they are not back in the system and the problematic ones have indeed been deleted from the work queue. Still, some WUs are long runners. But that really is a minority. You need to understand that RNA World, unlike other projects where e.g. a fixed number of MD simulations is computed over and over again, is a project with very heterogeneous WUs. And this will remain so. But, anyway, I think I have given enough explanations concerning all this. It ultimately is up to you what you like to support and what not. ;)

Michael.

LAURENU2
05-19-2010, 09:43 AM
This statement may not be true


Let me clarify a few things here: A long WU in RNA World does definitely NOT mean that you are at risk of not completing it without errors except of course you abort it manually
Michael.


I saw WUs in my systems that listed 244 hrs
What if it ran for 9 days, the power fails, start over
Run for 7 more days again and bam, another power outage
Are your WUs OK to run for 20 or 30 days?

Or what about the little guy who does not work 24/7 and shuts down his PC overnight
It would be an endless loop for them without save points until your project aborts it
and then gives them a new long WU to waste their time on
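
The endless-loop concern can be made concrete with a small model. This is a simplified sketch under stated assumptions (fixed-length sessions, every shutdown loses all progress since the last checkpoint); the function and parameter names are hypothetical and it does not describe the BOINC client itself.

```python
def hours_to_finish(wu_hours, session_hours, checkpoint_hours=None, max_sessions=1000):
    """Total computing hours spent until a WU completes, or None if it never does.

    The machine runs in sessions of `session_hours` and is then shut down.
    Without checkpoints (`checkpoint_hours=None`) every restart begins at zero;
    with checkpoints, progress is kept in multiples of `checkpoint_hours`.
    """
    saved = 0.0
    spent = 0.0
    for _ in range(max_sessions):
        remaining = wu_hours - saved
        if remaining <= session_hours:
            return spent + remaining
        spent += session_hours
        if checkpoint_hours:
            progress = saved + session_hours
            saved = (progress // checkpoint_hours) * checkpoint_hours
        # without checkpoints `saved` stays at 0 and the WU restarts from scratch
    return None


if __name__ == "__main__":
    # A 24-hour WU on a PC that is switched off after 8 hours each day.
    print(hours_to_finish(24, 8))                      # no checkpoints
    print(hours_to_finish(24, 8, checkpoint_hours=1))  # hourly checkpoints
```

In this model a WU longer than one session can never finish without checkpoints, while any WU shorter than a session is unaffected, which matches the distinction drawn in the thread between 24/7 machines and PCs that are shut down overnight.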

I want to support your project, but I think a WU that runs for more than 4 hrs without even 1 save point is risky
And one that runs for 24 hrs I do not want to do even if it has save points
I am sorry Michael, I made a promise to myself 10 years ago not to do long WUs
That's why I just aborted 600 to 700 hrs of my computer time
I hope this makes you understand how strongly I feel about this matter of long WUs

gopher_yarrowzoo
05-19-2010, 10:09 AM
This statement may not be true
Or what about the little guy who does not work 24/7 and shuts down there PC overnight
it would be a endless loop for them with out save points until your projects aborts it
and then gives them a new Long Wu to wast there time on

I want to support your project But I think a WU that runs for more then 4 hrs with out even 1 save point is risky
And one that rum for 24 hrs I do Not want to do even if they have save points


QFA Lauren - I currently don't do RNA World. I was thinking about it, but I am a little guy who doesn't run BOINC 24/7 on my main PC, and my backup PC is getting used more as my main PC needs a major overhaul now. So I wouldn't want to have to abort days of work if there are no save points. I mean, how can that work? I run multiple projects on both machines, so what happens when they switch projects: same thing, it starts from 0 again?
I need to run Windows as I have software I use for work on my home PC, so I can do remote resetting of the systems we have and see where the problem lies.

yoyo
05-19-2010, 04:58 PM
I checked this 189,000s workunit http://www.rnaworld.de/rnaworld/workunit.php?wuid=1015952 from LAURENU2's list. You can see on the workunit page that this one is estimated at 318,000s runtime on the reference system. One result has already finished for this WU, and it needed 206,000s.

I also checked this one, http://www.rnaworld.de/rnaworld/workunit.php?wuid=1015891, which you aborted after 111,000s. It is estimated at 402,000s (4d 15h) on the reference system and was finished by one host after 117,000s.

Yes, they are long, but not 10 days.
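
For readers who prefer days and hours, the run times quoted in this post can be converted with a few lines of Python. The helper name is hypothetical; the figures are the ones given above.

```python
def format_runtime(seconds):
    """Convert a run time in seconds to a 'Xd Yh Zm' string."""
    minutes, _ = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return f"{days}d {hours}h {minutes}m"


if __name__ == "__main__":
    # Run times quoted in the post above, in seconds.
    for label, secs in [
        ("WU 1015952 estimate", 318_000),
        ("WU 1015952 finished result", 206_000),
        ("WU 1015891 estimate", 402_000),
        ("WU 1015891 finished result", 117_000),
    ]:
        print(f"{label}: {format_runtime(secs)}")
```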

Maybe your BOINC DCF (duration correction factor) is too high and therefore the displayed remaining time is so high. This time is calculated rather mysteriously by the BOINC client anyway.

yoyo

LAURENU2
05-19-2010, 07:09 PM
Yes Yoyo, I did abort 1 or 2 that were 90% complete with run times of 50 & 95 hrs
It has to do with the promise I made to myself, read above

The 10 days came from the est time of a WU listed here that I aborted
10 days or 1 day, the same looping thing can happen without save points

What is holding you back from having SAVE POINTS
It seems like almost all other BOINC projects have them
You just tell it to save every 10 min

I know nothing about a (duration correction factor), BOINC is a stock install with no tweaking

Michael H.W. Weber
05-20-2010, 04:46 AM
I checked this 189,000s workunit http://www.rnaworld.de/rnaworld/workunit.php?wuid=1015952 from LAURENU2 list. You see on the workunit page that this one is estimated with 318,000s runtime on the reference system. There is already one result finished for this wu, which needed 206,000s.

I checked also this http://www.rnaworld.de/rnaworld/workunit.php?wuid=1015891, which you aborted after 111,000s. This is estimated with 402,000s (4d 15h) on the reference system and finished by one host after 117,000s.

Yes, they are long, but not 10 days.
Well, I see. His machines really picked the biggest ones that remained in the system (and will remain in the future as well for analyses of other organisms). :D



What is holding you back from having SAVE POINTS
It Seems like most all other BOINC have them
You just tell it to save every 10 Min
No, you do not 'just tell it to save'. ;) If you want checkpoints (i.e. 'save points') then you need to write (or re-write) the entire code of your science application de novo. I have explained this many times (even in our FAQ). Most scientific applications that run on high performance clusters are not designed to have checkpointing. With RNA World, we run such software and it is technically impossible to change the code to write the checkpoints: it would have to be re-written completely with a different design. Moreover, it is a waste of time to do this for all the implemented and the many, many upcoming software modules in the future. I know other DC projects have checkpoints, but these usually have one single science core client and the entire system was from the very beginning designed to run as a DC system - that is a completely different starting situation compared to what we are dealing with at RNA World. Hence, we thought about methods to develop a universal checkpointing, i.e. 'writing save points' at the system level to apply it to exactly these applications which do not allow for checkpointing generally - and these applications are undoubtedly the majority. ;) This poses a different problem, namely that you require admin rights to save the entire RAM content to disk (it would be as if you send your laptop to sleep mode). Would you grant admin rights to a DC project? No. See, and that's the problem. We have now decided to use a virtual machine approach within BOINC. But that project is not yet complete.
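
To illustrate what application-level checkpointing involves, here is a toy sketch. It is not how any RNA World application works; the function name, the JSON state file and the parameters are all assumptions. The example only works because the entire state of the computation is two numbers that the program itself knows how to save and restore.

```python
import json
import os


def run_with_checkpoints(items, state_path, checkpoint_every=100):
    """Sum `items`, saving progress to `state_path` every `checkpoint_every` steps.

    A toy computation standing in for a science application; the file
    layout and parameter names are illustrative assumptions.
    """
    index, total = 0, 0
    if os.path.exists(state_path):
        # Resume from the last checkpoint instead of starting at zero.
        with open(state_path) as f:
            state = json.load(f)
        index, total = state["index"], state["total"]

    while index < len(items):
        total += items[index]
        index += 1
        if index % checkpoint_every == 0:
            tmp = state_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"index": index, "total": total}, f)
            os.replace(tmp, state_path)  # never leave a half-written checkpoint

    if os.path.exists(state_path):
        os.remove(state_path)  # finished: the checkpoint is no longer needed
    return total


if __name__ == "__main__":
    print(run_with_checkpoints(list(range(1000)), "toy_state.json"))
```

A program whose state is spread over large in-memory data structures and nested loops needs the same save and resume logic threaded through its whole design, which is the reason given above for preferring a system-level or virtual machine approach.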


I know nothing about a (duration correction factor) Boinc is a stock install with no tweaking
Yes, and as a stock install it has problems that have been known for a long time. One is this DCF issue: the run time estimate BOINC displays in the form of your progress bar is adjusted by (your stock install) BOINC manager depending on how much time your machine spends on tasks other than the BOINC tasks. If you play a game while RNA World computes, this gaming comes at the cost of RNA World WU progress. BOINC detects that and re-calculates the remaining run time. In principle, a neat idea. In practice, however, when you stop playing, BOINC takes a long time to re-adjust this correction (factor). As a result, it displays run time estimates that are far too high. And this is exactly what happened with your RNA World WU. One could say that BOINC actually 'betrayed' you by telling you the wrong run time estimate.
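
The effect described above can be sketched with a toy model in Python. This is not BOINC's actual code: the update rule (jump up immediately, come down by only 10% of the gap per finished task) and both function names are assumptions chosen purely to illustrate a correction factor that recovers slowly.

```python
def update_dcf(dcf, estimated_s, actual_s):
    """Toy duration correction factor update (assumed rule, not BOINC's)."""
    ratio = actual_s / estimated_s
    if ratio > dcf:
        # A task ran longer than predicted: raise the factor at once.
        return ratio
    # A task ran faster than predicted: move only 10% of the way back.
    return 0.9 * dcf + 0.1 * ratio


def displayed_estimate(estimated_s, dcf):
    """Run time shown to the user: the project's estimate scaled by the factor."""
    return estimated_s * dcf


if __name__ == "__main__":
    dcf = 1.0
    # One task takes three times its estimate because the machine was busy.
    dcf = update_dcf(dcf, estimated_s=10_000, actual_s=30_000)
    # Afterwards, tasks run exactly as estimated, but the factor sinks slowly.
    for finished in range(1, 6):
        dcf = update_dcf(dcf, estimated_s=10_000, actual_s=10_000)
        print(finished, round(dcf, 3), round(displayed_estimate(402_000, dcf)))
```

In this model a single slow task inflates every later estimate, and it takes many normally running tasks before the displayed run time comes back close to the project's own figure.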

Well, I think it should all be a bit clearer now. For me it is clear that running RNA World on machines that do not run 24/7 is a problem as long as we do not have checkpointing. ;) And please note that I am by no means 'angry' or so because you have put forward some criticism here. The only thing I can do about this is to try to explain why we do the things we actually do and why some things are not yet implemented. :D

Michael.

LAURENU2
05-20-2010, 09:47 AM
Well, thank you Michael
That was a very clear explanation of how RNA works within BOINC
And I know you are not upset, nor am I upset. It is just that I can't do long WUs
And I was worried about the 8/5 users stuck in a loop

As for the DCF, only 3 PCs out of the 60 I run here are used by people; the 58 are only doing DC 24/7
And the 3 do not run RNA because they need a reboot from time to time

I will continue to run RNA, but I will keep an eye out for long WUs and send them back to you, like this one

2473870 1059802 12 May 2010 21:06:46 UTC 17 May 2010 13:36:43 UTC Aborted by user 327,886.09 296,718.66 1,285.62

I know that's a lot of points to flush, but my word is my honor

Michael H.W. Weber
06-18-2010, 09:11 PM
http://www.rnaworld.de/rnaworld/forum_thread.php?id=51

Michael.

Michael H.W. Weber
06-26-2010, 04:13 PM
Hi guys, so the conference turns out to be a remarkable success. We have met a lot of people who are very interested in our project. Even better, we have acquired new collaboration partners who can help us in experimentally validating our computational results on a broader scale. :D Also, posters presented by independently working groups indicate that we are absolutely on the right track with our analyses. So, I am quite confident that we will soon manage our first publication in a scientific journal. Moreover, we have scheduled the next conference presentation of RNA World, which will take place in September in Dresden, Germany. Details on that at a later time. :D

Michael.

Michael H.W. Weber
10-14-2010, 04:05 PM
We have released an OSX client today.

Michael.

Norman
03-28-2011, 09:43 AM
We are currently working on reorganizing the CMSEARCH application into two variants, of which one will run a combined package of many very small WUs while the other runs the rest (the default CMSEARCH WUs). With this we intend to achieve three goals. First, the combination of many tiny WUs into one package tremendously reduces the client-server communication load: a big relief for our server, which will make it significantly more stable even during team challenges and races. Second, the packaged WUs will write a checkpoint after each completed sub-WU. Third, because the users will be allowed to opt in or out of each of these two CMSEARCH variants, there will be much more flexibility at the user's end to control maximum CMSEARCH runtimes.
Michael.
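
A small Python sketch of the first two goals, under made-up assumptions: the helper names, the package size of 100 and the runtime of 60 seconds per tiny WU are illustrative only and are not taken from the project.

```python
def make_packages(sub_wu_runtimes, package_size):
    """Group tiny WUs (given by their runtimes in seconds) into packages."""
    return [
        sub_wu_runtimes[i:i + package_size]
        for i in range(0, len(sub_wu_runtimes), package_size)
    ]


def worst_case_lost_work(package):
    """With a checkpoint after each sub-WU, an interruption costs at most
    the one sub-WU that was running, not the whole package."""
    return max(package)


if __name__ == "__main__":
    tiny_wus = [60] * 1000          # hypothetical: 1000 tiny WUs of 60 s each
    packages = make_packages(tiny_wus, package_size=100)
    print("tasks the server hands out:", len(packages), "instead of", len(tiny_wus))
    print("package runtime:", sum(packages[0]), "s")
    print("lost on interruption:", worst_case_lost_work(packages[0]), "s at most")
```

With these numbers the server handles 10 tasks instead of 1000, while a client that is interrupted loses at most one minute of work.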

@tm: a few of the old cmsearch WUs are still in the pipeline; after that the project can make the final changes (choice: long runners or the shorties).
New apps with better performance are out for Mac OS 10.4+, Linux and Win.
We also urgently need testers with Mac OS 10.4 and 10.5 on Intel 64-bit.
(Best to write directly here: http://www.rechenkraft.net/phpBB/viewforum.php?f=74 )
happy crunching ;)

PS: you can see how many tiny WUs are in an archive from the task name:
cms_GA-p[DZ-Lin64s]_7_Desulfitobacterium-hafniense-Y51_AP008230........
Here there are 7 in one archive.
It can be over 100 or 1000.
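
For anyone who wants to pull that number out of a task name automatically, here is a minimal Python sketch. It assumes that the count is always the number between the closing bracket and the next underscore, as in the example above; that naming rule is inferred from this one example and is not documented project behaviour. The function name is made up.

```python
import re

# Assumed pattern: "]_<count>_" right after the bracketed platform tag.
_COUNT_PATTERN = re.compile(r"\]_(\d+)_")


def sub_wu_count(task_name):
    """Return the number of tiny WUs in a packaged task, or None if not found."""
    match = _COUNT_PATTERN.search(task_name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Only the visible start of the example name is used here.
    print(sub_wu_count("cms_GA-p[DZ-Lin64s]_7_Desulfitobacterium-hafniense-Y51_AP008230"))
```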

Norman
03-29-2011, 09:09 AM
yoyo has duplicated some old WUs to get these archives finished. But this means that these tasks got the same HR class as the existing ones.
So some HR classes are now blocking the send queue and must be sent first, before you get new work. ;)
A few more 32-bit Linux hosts would be helpful.

Norman
03-31-2011, 04:01 PM
any problems, bugs or suggestions?

LAURENU2
04-01-2011, 10:23 AM
any problems, bugs or suggestions?

Yes saving checkpoints

Norman
04-02-2011, 05:19 AM
hehe ;)
Oooh yes! that's a big request from me.
Someone is developing it, and I hope it is done as soon as possible.

Norman
04-06-2011, 04:41 PM
Hello!
We will likely kill the remaining 10 intron WUs.
The users will get full credit for the terminated work.
The runtimes can then finally be split into short (S) and long (XXL).
The few introns will then be back under XXL.
When it happens, there will also be news on the project's site.

Norman
04-07-2011, 12:45 PM
The introns should now have been cancelled in the project and the users should have received their credits.
The separation into short (S) and long (XXL) is now underway, as planned.

edit:
The remaining introns were canceled and the credits should have been granted, or are still on their way.
If this is not the case, please let us know.
Next, cmsearch XXL will be adapted so that it works like the S variant.
A few test WUs will follow and the introns will be redistributed under XXL.
The WU generator is also still being adapted for the big ones.
Right now, XXL is a test app (so whoever wants it has to allow test applications).

Norman
06-07-2011, 01:26 PM
Our database has crashed again and will be repaired as soon as possible.
sorry.

Norman
06-08-2011, 11:20 AM
The database crash is over and the new virus WUs are now in the server pipeline ;)

Norman
06-11-2011, 07:24 AM
It seems to be a little unclear to users what XXL and S are and what we mean by them.
So we have changed the app names to:

cmsearch XXL (large) 1.0.2
cmsearch S (small) 1.0.2

We hope it's now easy to understand what we mean by XXL and S ;)
Is it better?! Any suggestions?

Paratima
08-16-2011, 08:34 PM
Yes. Now how can I tell BOINC that I do NOT want to run the XXL types?

*EDIT: Never mind - set preferences on your site. Fixed.