Howard, can you respond to this?



[DSLR]MntlCase
03-31-2002, 11:20 PM
From a DOCTOR who posted at DSLR.

Keep in mind, not a registered member, but apparently one of the sponsors of the TSC project.

(You can pretty much ignore everything but what I have bolded. The rest applies to previous things that were said/happened at DSLR)


yikes,

I did not know the rules. Did not know only one team could support, or that I could only talk about it in one place. I know science, I know how to make drugs, I know how to help people. I don't know rules of posting to bboards.

I only wanted people to help since this is the best designed distributed computing project in terms of getting real results to treat real people and we work with the people that made lipitor, discovered taxol, cured Hodgkin's and ran the NCI.

It is good science, and we will follow-up.

What are the rules?

If I wanted to be harsh, I could tell you computer folding does not work, I don't think you will find E.T. and most public drug finding projects take public targets that Merck has already made drugs against. I know my stuff. We want to take computing power and save lives. Not-for-profit (I've made enough money).

But I don't know the rules of posting to this site.

Jonathan

[DSLR]MntlCase
03-31-2002, 11:38 PM
To save me from cutting and pasting EVERYTHING Dr. Rothberg (yes, THAT Dr. Rothberg) has said, here is the link to his other responses, as well as mine.

http://www.dslreports.com/forum/remark,2906605~root=dist~mode=flat

Dyyryath
03-31-2002, 11:53 PM
Interesting comments in an interesting thread, though my expertise lies too far from bioinformatics to really draw any conclusions based on what he had to say.

I'd be interested to hear what Howard thinks, too.

Thanks for posting that, MntlCase.

[DSLR]MntlCase
03-31-2002, 11:58 PM
Originally posted by Dyyryath
Interesting comments in an interesting thread, though my expertise lies too far from bioinformatics to really draw any conclusions based on what he had to say.

I'd be interested to hear what Howard thinks, too.

Thanks for posting that, MntlCase.

Mine too Dyyryath, that's why I'm asking an expert. :)

Man, I wish I understood more about what exactly we're helping. :)

Maybe I should go back to school and major in Biology?

jkeating
04-01-2002, 09:31 AM
I'm curious why 10,000 users by May 26. I suspect May 26 will be the official launch of the project, but how did they come up with 10,000 users? Number of users hoped for at the start? Or for the project?

One of the best things that this organization could do is get some general media publicity to bring new blood into DC. All of the existing teams/groups are getting spread pretty thin amongst the current DC projects. The reason the UD project with 1,500,000 users got so big so fast was the huge amount of publicity at the start of that project. I heard/read about it from multiple sources.

BTW, Dyyryath, I think we'll need yet another forum soon.

FEEDB0B0
04-01-2002, 12:00 PM
Hi Folks,

Christopher Hogue here, I'm Howard's Ph.D. advisor and the scientist behind the Distributed Folding Project. I don't usually post much, but when I do, it is usually long, so apologies in advance...

Thanks to all of you for getting us to another BILLION structures on BPTI, protein number 3. :D

We are close to starting our first 10 Billion run, two more small proteins to go.

It is always interesting when other scientists weigh in on topics like these. Dr. Rothberg's comments are interesting, and he certainly has a lot of knowledge about conventional protein structure and drug docking methods. I don't think he has read any of the scientific information on our web site (our papers). I expect that he has not internalized the implications of our new algorithm as it is an "embarrassingly parallel" implementation and, as such, changes the landscape considerably.

We are already doing protein structure sampling faster than anyone and beating all previous attempts on our first few proteins - but wait for it - we have to get it published as a scientific work for it to be accepted by scientists and medical doctors like Dr. Rothberg. Those are the rules we scientists have to play by. While "FUD" (fear, uncertainty, doubt) is parlayed in the computer industry aggressively to knock off competitors, it is a natural part of the scientific method and something we are quite well used to. We are very happy to wait till our experiment is published and prove our work to the naysayers. I've been building this algorithm and method into an "embarrassingly parallel" framework since 1987 and we are well on our way to proving our method, thanks to your help. :)

Well, it sure looks like Rothberg is trying to recruit members to TSC and get to 10,000 users for a drug docking method. I admire Dr. Rothberg's passion for distributed computing. He may be right in suggesting that drug docking is closer to "saving lives" than is protein folding. But it is always basic science that is the foundation for all other science.

For example, our algorithm has uses in helping improve drug docking. How? Well, proteins move quite a bit. Most drug docking systems don't move the protein at all, they just try out thousands to millions of drugs on an immobile structure. Most drug docking software does not reproduce experimental results. The experimental results show that the protein, in fact, moves slightly between different drugs. So the computation should move the protein, but it doesn't :rolleyes: . We can help by making hundreds of minor variant proteins (within 1-2 Angstroms RMSD) on a single PC in one day, simulating random motion. Those can be used with the drug docking algorithm to make a statistical profile of how well the best drug would bind to the protein taking its motion into consideration. This should give better data, but nobody has tried it yet. The method is in our papers and we have other software and tutorials on our site that do this exact chore.
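To make the "statistical profile" idea concrete, here is a minimal Python sketch. It is not the project's own method or software: the Gaussian jitter, the `toy_score` function, the 6 Angstrom contact cutoff, and the straight-line "protein" are all assumptions made up for illustration, and the RMSD is computed without any superposition step. It only shows the shape of the calculation: generate many small variants within an RMSD limit, score the same ligand position against each, and summarize the spread.

```python
import math
import random
import statistics


def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) atoms, with no superposition step."""
    if len(a) != len(b):
        raise ValueError("structures must have the same number of atoms")
    total = sum((p - q) ** 2 for atom_a, atom_b in zip(a, b) for p, q in zip(atom_a, atom_b))
    return math.sqrt(total / len(a))


def make_variants(coords, n, max_rmsd, sigma, rng):
    """Collect n randomly jittered copies of coords that stay within max_rmsd of the original.

    Assumption: independent Gaussian noise per coordinate stands in for real protein motion.
    Choose sigma well below max_rmsd, otherwise few candidates pass the filter.
    """
    variants = []
    while len(variants) < n:
        candidate = [tuple(c + rng.gauss(0.0, sigma) for c in atom) for atom in coords]
        if rmsd(coords, candidate) <= max_rmsd:
            variants.append(candidate)
    return variants


def toy_score(protein, ligand_pos, cutoff=6.0):
    """Hypothetical docking score: minus the number of atoms within cutoff of the ligand.

    Lower is better. A real docking program would supply this function.
    """
    return -sum(1 for atom in protein if math.dist(atom, ligand_pos) <= cutoff)


def binding_profile(variants, ligand_pos, score=toy_score):
    """Score one ligand position against every variant and summarize the distribution."""
    scores = [score(v, ligand_pos) for v in variants]
    return {
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),
        "best": min(scores),
        "worst": max(scores),
    }


rng = random.Random(0)
# Toy "protein": 20 atoms spaced 1.5 Angstroms apart along a line.
protein = [(1.5 * i, 0.0, 0.0) for i in range(20)]
variants = make_variants(protein, n=200, max_rmsd=2.0, sigma=0.5, rng=rng)
profile = binding_profile(variants, ligand_pos=(14.0, 3.0, 0.0))
print(profile)
```

A ligand whose score stays good across most variants (low mean, small standard deviation) would be a more convincing candidate than one that only scores well against a single rigid structure.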

We would be happy, of course to collaborate with another distributed computing group on the topic of drug docking to see if a "statistical" approach would improve their work. That is what basic research offers to applied research - incremental improvements in the methods, and that is why everyone benefits from supporting both kinds of projects.

We are focusing on the question of how big a sample we need to find a protein fold that is close to correct. Once we know the answer, we can move on to making reliable "coarse" predictions and focus on improving them. With expected throughput of over a trillion structures a year, Distributed Folding will easily outcalculate Blue Gene on a 100,000 cpu user base in terms of protein fold sampling. Why?
Well one reason is that Blue Gene won't be here for another couple of years :p

But the real reason is that we are using a different, novel algorithm than the one generally accepted and commonly used. Plainly stated, we can sample protein space without parallel computers of the IBM Blue Gene kind. We are "embarrassingly parallel", and I'm not embarrassed to say so :eek: .
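For readers wondering what "embarrassingly parallel" means in practice, the following Python sketch may help. It is only an illustration under stated assumptions, not the Distributed Folding client: `toy_energy`, the ten random angles per "conformation", and the work-unit sizes are all hypothetical. The point is that each work unit needs nothing but its own seed, never talks to any other unit, and the only coordination is a final merge of the best results.

```python
import heapq
import math
import random
from concurrent.futures import ThreadPoolExecutor


def toy_energy(angles):
    """Hypothetical score: zero when every angle is 0 degrees, larger otherwise."""
    return sum(1.0 - math.cos(math.radians(a)) for a in angles)


def run_work_unit(seed, n_samples=2000, n_angles=10, keep=5):
    """One independent work unit: sample random conformations and return its best few."""
    rng = random.Random(seed)  # private generator, so units share no state
    scored = []
    for _ in range(n_samples):
        angles = [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
        scored.append((toy_energy(angles), seed, angles))
    return heapq.nsmallest(keep, scored)


def merge_best(results, keep=5):
    """The only central step: pool the per-unit results and keep the lowest energies."""
    return heapq.nsmallest(keep, (item for unit in results for item in unit))


seeds = range(8)
# Threads stand in for separate volunteer machines here; this shows the structure,
# not a speed-up, since CPython threads do not run CPU-bound work in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_work_unit, seeds))

best = merge_best(results)
for energy, seed, _angles in best:
    print(f"energy={energy:.3f} from work unit {seed}")
```

Because no unit depends on another, adding more machines simply adds more seeds, which is why this style of sampling does not need a tightly coupled supercomputer.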

Having said that, the most interesting future use of the DF algorithm may be working *in tandem* with the IBM Blue Gene machine, as a kind of preprocessor.

The best structures from DF could get fed into Blue Gene as starting points, making it have to do less work to get to the finished structure. I have discussed this at length with IBM's Joe Jasinski, who is in charge of their research team, and that may be why IBM scientists have asked me to be on the Blue Gene Advisory Board :cool: .

The Folding@Home project uses a similar kind of code at its core (molecular dynamics) as intended for Blue Gene. Thus we could work in tandem with their project as well. However they have issues in that their code implementation will, apparently, not scale to proteins that are as large as most disease proteins. Our third protein BPTI at 58 amino acids is larger than most of the ones they have studied, and we can compute structures involving thousands of amino acids. My hope is that they can reimplement their project to allow it to work with larger proteins, then we may be able to collaborate.

In the meantime we will accumulate data on large proteins. Our results will be valuable for both Blue Gene and Folding@Home about how well scoring functions work at selecting the best protein folds.

Distributed Folding's approach has been to work at a pretty fast pace to get as much of our software engineering right and to satisfy any unforeseen user requirements using a base of sophisticated, dedicated users - you!

We intend to have a big publicity push for scaling up for the masses, but only after we finish a new server architecture suitable for 100,000+ users. This work is already underway, and I recently received good news about a successful grant renewal from the Natural Sciences and Engineering Research Council of Canada (NSERC) that will let me add another full time person to the project and pay for any increases to bandwidth we may need.

You may not know that we have never advertised this project except for a few meetings and a post on the beowulf cluster mailing list. We have gotten every single user so far by "word of mouth" (or borging... who knew?). When Howard and I lose a user it is because we haven't satisfied their needs as a DF project. This focuses us on getting the bugs beaten out of the project. We have listened to all of your requests and try to deliver satisfaction in regards to the server, the software, and what we think will be an exciting project, especially as "CASP" season heats up.

CASP stands for the Critical Assessment of Structure Prediction, a large scale blind test of protein folding that weeds out the hype from the real capabilities. If you've read the About | Science page on our website
http://distributedfolding.org/science.html
you will know that we are going to be participating in the CASP-5 protein structure prediction experiment this year.

The CASP-5 website is here:
http://predictioncenter.llnl.gov/casp5/Casp5.html

A terrific WIRED article about CASP and Blue Gene is here:
http://www.wired.com/wired/archive/9.07/blue.html

(I'll get Howard to post links to these on the DF web site.)

Howard and I were both at the CASP-4 meeting described in the article, lurking in the shadows, planning on how to convert our two second place CASP-4 placements (out of only 4 predictions we submitted that year - sample size = 200,000! ) into a sweep for CASP-5. ;)

We may be the only group participating in this true blind test of protein folding using distributed computing. We will probably have the largest computing force working on CASP-5. So we will know at the conclusion of CASP-5 later this year just how much of an impact distributed folding and embarrassingly parallel brute force algorithms have on the field of protein structure prediction.

As far as whether computers can fold proteins, I think the WIRED article speaks volumes as to the progress that is being made in algorithms prior to "going big" in terms of the computing. It is a long article, but written for the layperson so give it a read if you have a moment.

And btw, Blue Gene won't be ready to play until CASP-6 (2004).
Fold on! We will be the only game in town going big for CASP-5 and we are doing it as we speak. WIRED will be writing about us this year, just wait for it...

Christopher Hogue :cool:
Senior Scientist, Samuel Lunenfeld Research Institute
Mt. Sinai Hospital and Assistant Professor, Dept. of Biochemistry
University of Toronto
http://bioinfo.mshri.on.ca

Dyyryath
04-01-2002, 12:13 PM
Hey Chris, thanks for the post! Very articulate and informative (you must have an education or something ;) ).

xj10bt
04-01-2002, 02:30 PM
Thanks for the post Dr. Hogue. I'd like to put in a plug for Howard, he has been extremely responsive to our questions, complaints, and whining ;)
If there is a sure way to make a distributed computing project popular that is it!

IronBits
04-01-2002, 02:41 PM
Well... that and *nix clients ;)
Good job Howard!

[Ars]KD5MDK
04-02-2002, 03:12 AM
Including OS X. It's what won me to this project, where all the others are lacking.

I found that explanation of the science here very interesting. I'm particularly interested in the comments of how Distributed Folding compares to Folding@Home, because F@H is well entrenched in the DC community, and it would be a feather in us innovators' hats if it was provable that Distributed Folding had better science behind it. Maybe we could even convince some to join us ;)

pointwood
04-02-2002, 04:27 AM
Thanks for that post Chris - very interesting - even for a person like me that doesn't understand a damn thing about folding :)

Michael H.W. Weber
04-02-2002, 08:47 AM
Originally posted by [Ars]KD5MDK
Including OS X. It's what won me to this project, where all the others are lacking.

I found that explanation of the science here very interesting. I'm particularly interested in the comments of how Distributed Folding compares to Folding@Home, because F@H is well entrenched in the DC community, and it would be a feather in us innovator's hats if it was provable that Distributed Folding had better science behind it. Maybe we could even convince some to join us ;)
Well, we have been among you guys since the initial start of DF (team rechenkraft.de) and we have done our best to bring this project to a lot of people's attention in Germany by announcing it appropriately on www.rechenkraft.de - even though we originally fold for the Folding@Home project. :D

I think it is useful to state that we do not see a true competition between the Folding@Home (FAH) / Genome@Home (GAH) and Distributed Folding projects. All these projects have advantages and disadvantages and - importantly - focus on different aspects of protein folding. :)
What has impressed me a lot with DF is Howard Feldman's quick response to user bug reports and the good documentation of the DF website. The user support provided by DF is of outstanding quality and in every way superior to the FAH/GAH project(s). ;)

Keep up the good work guys,
Michael.

P.S.: Even FAH has finally had an OS X client out for a couple of weeks now.

FoBoT
04-02-2002, 09:34 AM
i'll add some more compliments for howard, great job!

the perception in the DC community is that some projects ignore their participants, howard on the other hand, has been a completely different story.
very helpful and extremely quick to correct errors and provide answers to the users

A+++++++++ for howard!!! :)

i have no idea about the science behind any of these projects, i approach them more from a usability viewpoint
i have access to pc's at work and home, depending on their "other" uses, i select a DC client that is best for that boxen, depending on connectivity, OS, other use, etc
as such, i participate in many projects at once, regardless of the "end" in mind by the project :rolleyes:

MAD-ness
04-03-2002, 11:29 PM
Howard has been incredibly approachable, responsive, reliable, helpful and active in this project and that is, IMO, what has allowed the project to retain such a high activity level even during this alpha testing.

This is a small, word-of-mouth (as Dr. Hogue pointed out) community that already features a very stable, robust and configurable client that can be deployed on a wide range of systems and in varying computing environments.

As I learn more and more about the project (just recently stumbled onto the TraDES page) and Howard and Dr. Hogue [just think Howard, one of these days we will have to start calling you Dr. Feldman! ;)] post more and more information on the forums and the project web page, I become more and more excited about the science aspects of the project.

I have a personal interest in genetic research and anything related as my sister has a very rare and very extreme genetic disorder. In fact, most doctors that have treated her or whom I have spoken with and asked about the disorder (not positive that it is a disorder) are not familiar with it and VERY few have any first hand experience with actual patients. It is called Trisomy 13 (the condition is the result of having 3 copies of the 13th chromosome). Fatality rates are extremely high, life spans are incredibly low and I am pretty certain that the actual disorder can not be "corrected" - at least not outside of science fiction and at a very early stage of development even then. :(

Having spent more time in Children's Hospitals than any person possibly can while still retaining their sanity, I have a little bit of an understanding of just how many diseases, disorders, afflictions, maladies, etc. there are out there and the incredible amount of resources that are required to make steady and significant progress in the study and treatment of any particular illness. These resource requirements are so massive that current medical research and current funding for said research is nowhere near sufficient.

What is so alluring about this project is that the desire of the participants (especially amongst the major 'teams') to optimize their contributions is shared by the project leaders. I love making waves and the thought of providing you guys with a lot of ammo for your papers and presentations, especially in regards to the CASP projects, has me salivating.

I told my mother that I was running this project over Easter, trying to explain it to her and as I rambled on about the focus and the possible applications her eyes lit up. She knows that no medical research will help my sister, but she, too, has a special interest in anything that might result in even one less patient having to be at the children's hospital the next time we get the call to rush down there. She said to me "I didn't know you could do things like that with computers." I told her "You can and we are."

Distributed Folding, G@H, F@H, UD Cancer, TSC - it doesn't matter. I will do my best to help the DC community grow so we can make them all go begging for additional bandwidth and make their servers whimper at our Work Unit requests. =)

Sorry for the long, personal, only slightly on topic post. I saw Fyndhorn Kroog (I know I messed up that spelling) asking on the above referenced DSLReports thread if TSC might possibly benefit research into the disorder that his daughter has and it really reinforced my decision to participate in this project as well as other 'science' projects. I thought maybe my own story (ramblings) might help bring a little bit of a 'real world' element to the thread and perhaps convince any of you big producers who maintain a large number of machines/clients and at times might ask yourselves "is this really worth it?" that the answer is YES.

jbcool
04-08-2002, 01:54 PM
I want to thank Christopher Hogue for the information that he has posted to explain this to us common lay people, and for all the hard work of Howard and the others involved.

All my spare processing power has been moved to this project because of the efforts of these people.

Shaktai
04-08-2002, 10:14 PM
Ditto to all of the above. Dr Hogue's explanation was excellent. Very understandable, and Howard's dedication is second to none. This is the first DC project where I have really felt close to the project itself and felt I was being kept informed and having my concerns addressed. (Ubero runs a close second if they get some real work again). The entire Distributed Folding team has been great.