There is now a CUDA client available for AMD64 Linux on the pre-release download section. Currently only RC5-72 is supported.
Post your benchmarks here!
So, being totally ignorant about NVIDIA graphics cards: what will this run on? What's the power draw? What size PSU does the system need? What CUDA stuff (mentioned on the dnetc site) needs to be installed? Is anyone here trying this stuff out? Can multiple cards be loaded into a system? Etc., etc., blah blah.... :corn:
It's only money................
I don't have any practical experience with this stuff yet, but I might be able to answer some of your questions.
The distributed.net pre-release download page linked above says that you require "CUDA libraries and drivers", which I'd guess you can get here.
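I don't have hands-on experience either, but a quick sanity check before installing anything might look something like this. A sketch using standard Linux tools; the libcudart library name comes from later posts in this thread, and the output will obviously vary per machine:

```shell
# Is an NVIDIA card present in the box?
lspci | grep -i nvidia

# Is the CUDA runtime library already visible to the dynamic linker?
# (the CUDA dnetc client needs libcudart, per later posts in this thread)
ldconfig -p | grep libcudart
```

If the second command prints nothing, you likely still need the CUDA libraries/drivers from the NVIDIA link above.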
According to Wikipedia, CUDA will run on:
Quote:
CUDA works with all NVIDIA GPUs from the G8X series onwards, including the GeForce, Quadro and Tesla lines. NVIDIA states that programs developed for the GeForce 8 series will also work without modification on all future NVIDIA video cards, due to binary compatibility.
A full list of CUDA-enabled hardware can be found here.
I picked a card at random, the NVIDIA GeForce 8800 GT, and checked the technical specs on nvidia.com. The maximum power draw is 105W, and the minimum required PSU is listed as 400W.
Well as I suspected, I don't have a compatible card. So I think that I'll wait a bit before I purchase one and see how things go with those knowledgeable and brave enough to venture forth and experiment. :)
I just cranked up my 64-bit *nix GPU box on this project. Brucifer got me curious...
AMD 9550
9800GX2
I never ran this project before. This is more of a test, than anything else.
It is running at about 470 Mkeys/s. Is this fast?? Average??
Well, it's all relative... :) But how long had the system been running when it gave you that Mkeys number in the summary?
Since you were just testing today, and it was most likely just one system, I'd have to say you were ripping along at a good click. :)
Yeah,
I guess I jumped in with a number too fast, anyway...
To make a better test of it, I suspended the PS3grid wu's to take the extra load off the GPUs. Boinc is still running, just no GPU tasks.
The 'work-o-meter' in the Dnet client went down from 470 Mkeys/s to a steady 225 Mkeys/s, with 2 GPU tasks crunching.
The 'work-o-meter' has started to go up again, with the GPU tasks suspended. Right now it is at 270 Mkeys/s and climbing. This setup is completing ~50 packets every 5 minutes. I'll keep track for a couple of hours to get a better idea.
Can you run "dnetc.exe -bench" and post the results here?
Ok,
I am kind of a dnet noob.
Right now I am running Boinc wu's, PS3grid (GPU) wu's and RC5 on the same box. The RC5 output is reduced due to having to 'share' the GPU with PS3grid (which is also taking a performance hit). So I did 2 '-bench' runs: with RC5 and PS3grid wu's sharing the GPU (A), and with RC5 running solo on the GPU, Boinc still running (B).
A results:
[Dec 01 14:08:18 UTC] RC5-72: using core #0 (CUDA 1-pipe).
[Dec 01 14:08:37 UTC] RC5-72: Benchmark for core #0 (CUDA 1-pipe)
0.00:00:16.11 [197,038,435 keys/sec]
[Dec 01 14:08:37 UTC] RC5-72: using core #1 (CUDA 2-pipe).
[Dec 01 14:08:55 UTC] RC5-72: Benchmark for core #1 (CUDA 2-pipe)
0.00:00:16.19 [172,660,517 keys/sec]
B results:
[Dec 01 14:10:59 UTC] RC5-72: using core #0 (CUDA 1-pipe).
[Dec 01 14:11:17 UTC] RC5-72: Benchmark for core #0 (CUDA 1-pipe)
0.00:00:16.14 [241,531,335 keys/sec]
[Dec 01 14:11:17 UTC] RC5-72: using core #1 (CUDA 2-pipe).
[Dec 01 14:11:37 UTC] RC5-72: Benchmark for core #1 (CUDA 2-pipe)
0.00:00:16.84 [202,234,189 keys/sec]
I hope this helps.
Check http://distributed.net/speed/
You are about 10 times faster than a Xeon ))
Good to know,
I just wanted to test out the gpu setup since I had a system that would run it (and someone else got me curious :D).
I'm building another *nix box to try out a different gpu...
Bender, can you check whether it's possible to crunch RC5 and PS3grid in parallel?
I mean that if you run both projects, the RC5 keyrate is not HALVED. So can you check the PS3grid speed with RC5 running in the background?
75% of one project + 75% of the other project means you are using your GPU at 150% ))))
I already checked that out. I have been running PS3grid most of the time with RC5.
1. Ran 2 grid wu's with RC5 on both cores.
2. Ran 2 grid wu's with RC5 on 1 core; completed ~2000 in 1 day.
When crunching RC5 and PS3grid at the same time, the RC5 key rate was down ~50% and the PS3grid total run time went up ~40%.
*disclaimer...ymmv
These were short-term tests.
I just shut down my PS3grid wu's to get a 24-hour average on just RC5.
Edit: I'll fire up 1 grid wu after the 24-hour run, and I'll be more careful about the numbers I use... (noob here, remember??). It seems that the key rate in the summary is an average of keys processed? I just picked up on that.
Well, I looked at the bench:
197,038,435 keys/sec / 241,531,335 keys/sec ≈ 82%, not 50%
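For the record, the arithmetic on the core #0 numbers posted above (shared-GPU bench over solo bench):

```shell
# 197,038,435 keys/sec (RC5 sharing the GPU with PS3grid)
# vs 241,531,335 keys/sec (RC5 solo on the GPU)
awk 'BEGIN { printf "%.1f%%\n", 100 * 197038435 / 241531335 }'   # prints 81.6%
```

So the bench shows roughly an 18% drop when sharing, while the long-running summary showed closer to 50%.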
Bench is fine for exactly what it is: a bench.
Bender10 is posting what he actually gets in actual use, which is the real-world application of the client. Sort of like Detroit's advertised gas mileage for a car, which is normally nothing like what the car really gets under actual loading conditions. :)
Is there any windows client?
Nope, doesn't seem so. However, if you can get the source you could try compiling it yourself.
Well, I decided I couldn't wait.... :) So I ran out to my local friendly "Best Buy" store and picked up a 9800GT. Have to say I'm pretty impressed. Running it on 64-bit Linux and crunching RC5. Anticipating somewhere between 5600 and 5900 completed units in 24 hours, just from the GPU. Definitely moving along it is... :) If one is into crunching RC5, it's a good little item to have.
Now all I need to do is figure out how I'm going to get one of those super Tesla setups... :rotfl:
Wow! Nice boost eh? :)
yes, sounds nice :thumbs:
btw: how many does a PS3 do in 24 hours?
and does the CUDA client also support 2 GPUs (SLI, etc)?
IRT 2 GPUs: yes. That's basically what Bender10 is running, a dual-GPU card. But if you go look at the higher-end NVIDIA stuff out there, they have multi-GPU systems that are all CUDA-supported. The fancy Tesla systems are something else to behold :) and so is the price! LOL In the hard-core GPU world, we are at the low end of things. They have really pushed things along for scientific computing and heavy-duty engineering graphics, etc. But most of that stuff is way out of reach for the basic home cruncher.
As for the PS3 output, I don't know myself. I seem to remember IB mentioning something like 3,000 a day. Maybe he will pipe up with what he has gotten out of them.
I'm getting about 9T, or 8,666 blocks, per day, per PS3.
The Tesla C1060 card has 3GB more RAM than a GTX280, the same number of processors, and higher bandwidth. Its power consumption is much lower, but it costs $1,695 each.
The GTX280 can be had for much much less...
You could put 4 GTX280s in SLI mode into a whole newly built computer and save money. :)
After doing more research on the differences between these two cards, performance-wise there is no advantage to using a C1060 over a GTX280; it's essentially the same 'card'.
The C1060 has no video-out port, so you cannot hook it up to your monitor, but you will also pay four times as much for that missing feature ($1695 vs $400).
Interesting. So if no video, then do you load video drivers to run it, or what????
Yes. You can run a real video card in the same computer, hopefully at least an nVidia brand, so they can share the same video driver ;)
I read a thread at /., and it says the Tesla is an NVIDIA 8800, just without video output and optimized for running 24/7.
So now there is a second CUDA beta out, as the first one expires in less than 12 hours. The first one would look in the local directory for the libcudart.so.2 library file; however, the second one doesn't look in the local directory. So does some enterprising soul know how the "real" library files get installed into Ubuntu 8.10 without compiling the whole CUDA SDK? One reference out there said that the latest CUDA wasn't supported on Ubuntu 8.10, and that to stuff it in anyway, one would have to install an earlier gcc version and such... You would think there would be a package out there in the big cloud that drops the libcudart files in, for those who aren't doing software development and just want to run a precompiled client. Otherwise it's easier to just give the GPUs to the local gamer phreaks and go back to hard-core 4x4 wheeling... :eek:
:confused:
Per AMDave, copy the libcudart.so.2 file into the /usr/lib64/ directory.
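A minimal sketch of that fix, assuming the library was extracted somewhere like ~/cuda/lib (hypothetical path; adjust to wherever your copy of libcudart.so.2 actually lives):

```shell
# Copy the CUDA runtime library to where the dynamic linker will find it
sudo cp ~/cuda/lib/libcudart.so.2 /usr/lib64/

# Refresh the linker cache so dnetc can pick it up
sudo ldconfig
```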
I might add that if you had selected a specific core (doing away with auto-selection to save time), set it back to auto and run it a few times to verify which new core each GPU likes.
Well, there IS a Windows client )) but it's not even at the pre-release stage. Check this topic. I really slipped up...
http://www.free-dc.org/forum/showthread.php?t=17078 (the topic title should read WINDOWS CUDA CLIENT).
Post Title Edited :)
Pre-release 508 is available.
Windows CUDA is on its way!
Distributed.net released a BETA of the Windows CUDA client.
If you have an nVidia GeForce 8000-series or later card, it will provide a significant speed increase.
I'm getting over 500 MKeys/sec on my GTX-280.
The client is available at http://www.distributed.net/download/prerelease.php
It takes about 7 seconds to complete a wu now. ;)
Task Manager shows dnetc running at 11-13%, so there's almost no CPU impact on other projects that use the CPU only.
With Boinc running and using my browser, I'm getting 526.37 Mkeys/s :D
That's one wu completed about every 7 seconds :)
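Back-of-the-envelope check on those numbers (the 2^32 figure is an assumption about packet size, not something stated in this thread):

```shell
# Keys implied by ~7 s per wu at 526.37 Mkeys/s
awk 'BEGIN { printf "%.2f billion keys per wu\n", 7 * 526.37e6 / 1e9 }'

# For comparison, how long a full 2^32-key packet would take at that rate
awk 'BEGIN { printf "%.1f s per 2^32-key packet\n", 2^32 / 526.37e6 }'
```

That works out to roughly 3.7 billion keys per wu versus about 8 seconds for a full 2^32-key packet, so the ~7 s figure is in the right ballpark for packets at or just under that size.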
I get 545 Mkeys/s using dnetc.com instead of dnetc.exe.
8800GTX gets 248 Mkeys/s
Nine seconds on a GTX260
My lowly 9600GT gets 171Mkeys/s