
Linux vs. Windows: client speed



othello
01-11-2003, 05:07 AM
Hi,

one of my boxes running SoB is an AMD Duron 1200 / 256MB / Win98SE, another is a dual P3 1,13 / 512MB / Linux 2.4.16, running 2 clients.

I'd expect the Linux box to be about twice as productive as the Windows box, but the Windows box is running at ca. 85 kcEMs/sec, while the Linux box is running at only ca. 75 kcEMs/sec (both clients together). Why?

You can take a look at my stats:

http://www.seventeenorbust.com/stats/users/user.mhtml?userID=773

The peak around 01/09 was achieved during a 48h burn-in for the new processor on the Windows box, which is now offline.

cu,
othello

Scarblac
01-11-2003, 05:43 AM
Originally posted by othello
Hi,

one of my boxes running SoB is an AMD Duron 1200 / 256MB / Win98SE, another is a dual P3 1,13 / 512MB / Linux 2.4.16, running 2 clients.

I'd expect the Linux box to be about twice as productive as the Windows box, but the Windows box is running at ca. 85 kcEMs/sec, while the Linux box is running at only ca. 75 kcEMs/sec (both clients together). Why?


I have a Duron 600 running Linux. It usually does about 45-50 kcEM/s, and the stats say "Equivalent power (est): 1.27 GHz".

A friend's Athlon 1700 (Windows) has "Equivalent power (est): 3.18 GHz".

It seems to me that AMD simply annihilates Intel for this particular application.

The kernel might matter, though. My stats were around 38 kcEM/s with a 2.2 kernel; since then I have upgraded to 2.4.19 with Gentoo patches. But I have no idea about SMP.

othello
01-11-2003, 05:58 AM
Is there a way to calculate the cEMs/sec from the logfile of the Linux client?

I have an old P200 notebook running both Win98SE and Linux 2.4.16, so I could compare both OSes on the same hardware...

othello
01-11-2003, 06:25 AM
I just did some fast math on the logs of my P200 / 96MB notebook, using the longest periods without interruption I could find:

Linux: ca. 9h / block (62:55:24 / 7 blocks)
Win98SE: ca. 6:15h / block (56:16:11 / 9 blocks)

Looks like the clients are optimized for AMD / Windows. Perhaps that's what the developer of the client is running on his box? :-)

If I'm not mistaken, the project could gain lots of additional power just from optimizing the client for different hard- & software...

cu,
othello

jjjjL
01-11-2003, 07:12 AM
All archs are optimized. AMDs run faster than P3s because their FSB is higher and their memory arch is better. P4s smoke even Athlons because their bus is faster yet and there are SSE2 optimizations. There are optimizations for all instructions though (3dnow, mmx, sse, sse2). If your proc supports it... it uses it.

As far as OS goes, I would be surprised if the windows client was faster. I've always assumed the linux client would be slightly faster. It amazes me to this day that no one has done a test on a dual boot machine with both linux and windows clients just to settle it. I would, but I don't have such a system.

-Louie

othello
01-11-2003, 07:27 AM
Originally posted by jjjjL
As far as OS goes, I would be surprised if the windows client was faster. I've always assumed the linux client would be slightly faster. It amazes me to this day that no one has done a test on a dual boot machine with both linux and windows clients just to settle it. I would, but I don't have such a system.

-Louie

I could do this test on my notebook, but to get adequate results I'd have to use the same k and n on both clients, right?

cu,
othello

Firebirth
01-11-2003, 08:33 AM
It amazes me to this day that no one has done a test on a dual boot machine with both linux and windows clients just to settle it. I would, but I don't have such a system.

Well... in the Linux client, there is really no easy way to calculate the cEMs/sec. Or is there?

Pascal
01-11-2003, 10:33 AM
Originally posted by jjjjL
..
As far as OS goes, I would be surprised if the windows client was faster. I've always assumed the linux client would be slightly faster. It amazes me to this day that no one has done a test on a dual boot machine with both linux and windows clients just to settle it. I would, but I don't have such a system.

-Louie

I could test this on my machine (Athlon 1.2 GC/s, Thunderbird C), if you want.
I have installed Windows XP and SuSE Linux version 8.1 (Kernel 2.4.19)

Just tell me ;-)

Mystwalker
01-11-2003, 10:36 AM
Assuming that the Linux version also uses 250M cEMs for 1 block, it should be no problem to calculate the average speed.

Pascal: Just do it. I think there's enough interest here. :)


BTW:
I'm using a Duron @ 900 MHz and a mobile P3-m @ 1 GHz (733 MHz in SpeedStep mode).

At 1 GHz, the P3 is slightly faster than the Duron. When it's running @ 733 MHz, it doesn't lose 26,7%, though, but considerably less - maybe 10-15%.

Seems like L2 cache (and/or SSE) is used intensely.
A direct comparison Athlon <--> Duron should provide an answer to this guess...
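
The 26,7% figure is simply the clock reduction from 1 GHz to 733 MHz. A small Python sketch of that comparison follows; the 10-15% throughput loss is the rough estimate from the post above, and the helper name is invented for illustration:

```python
def relative_drop(before: float, after: float) -> float:
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before

# Clock speeds in MHz: full speed vs. SpeedStep.
clock_drop = relative_drop(1000, 733)
print(f"Clock reduction: {clock_drop:.1%}")

# The observed throughput loss was estimated at roughly 10-15%, so the
# client kept 85-90% of its speed on 73.3% of the clock.
for loss in (0.10, 0.15):
    work_per_cycle = (1 - loss) / (1 - clock_drop)
    print(f"{loss:.0%} loss -> {work_per_cycle:.2f}x work per clock cycle")
```

If the client were purely bound by core clock, the work-per-cycle ratio would be 1.00. A value above 1 suggests that something other than core clock limits the speed at 1 GHz, although the sketch cannot say what.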

othello
01-11-2003, 12:06 PM
Originally posted by Mystwalker
Assuming that the Linux version also uses 250M cEMs for 1 block, it should be no problem to calculate the average speed.


With the figures I posted earlier this would mean:

On my P200MMX / 96MB notebook the performance is

7725 cEMs/sec for Linux 2.4.16

and

11140 cEMs/sec for Win98SE

which means factor 1,44 for Windows / Linux.
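
The arithmetic behind these figures can be sketched in Python. This is only a sketch: it assumes, as suggested above, that one block is 250M cEMs on both clients, and the function names are invented for illustration. The run totals are the ones from the notebook logs posted earlier.

```python
# Assumption (from Mystwalker's post): one block is 250M cEMs on both clients.
CEMS_PER_BLOCK = 250_000_000

def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' duration (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def cems_per_second(total_time: str, blocks: int) -> float:
    """Average cEMs/sec over a run of `blocks` completed blocks."""
    return blocks * CEMS_PER_BLOCK / hms_to_seconds(total_time)

linux = cems_per_second("62:55:24", 7)
windows = cems_per_second("56:16:11", 9)
print(f"Linux:   {linux:.0f} cEMs/sec")
print(f"Win98SE: {windows:.0f} cEMs/sec")
print(f"Windows / Linux: {windows / linux:.2f}")
```

Computed from the exact run totals, the Windows figure comes out at about 11107 cEMs/sec, slightly below the 11140 quoted above; the Linux figure (about 7725) and the ratio of about 1.44 agree.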

Can someone please check this out on other hardware?

cu,
othello

Mystwalker
01-11-2003, 12:45 PM
If someone could tell me how to insert the block the Linux client should use into the config file, I could make that test, too.

Firebirth
01-11-2003, 12:47 PM
For Linux - what flags was the compiler given when compiling the client; and in what compiler + version was it done?

othello
01-11-2003, 01:29 PM
Originally posted by Mystwalker
If someone could tell me how to insert the block the Linux client should use into the config file, I could make that test, too.


For the calculations above I used the logs for different tests:

Linux: k=22699 n=1438174
Windows: k=21181 n=2041580

If this matters in any way, I may be totally wrong!

I just didn't expect such a big difference...

cu,
othello

Mystwalker
01-11-2003, 01:53 PM
Linux: k=22699 n=1438174
Windows: k=21181 n=2041580

Yikes, that explains a lot.

Although cEM is already a corrected value (indicated by the "c"), it doesn't completely even out the increasing effort of higher n's. To be specific, the correction is too strong: on the same system, cEMs/sec is usually higher the bigger the n. This usually isn't a big problem, as the difference between the n's of successively computed work units is generally not very big. But a difference of 600,000 does indeed make a noticeable change.

AFAIK there will be another measuring unit soon.

MAD-ness
01-11-2003, 03:53 PM
Any comparison test run using different k and/or n values is going to be highly, highly suspect for performance comparison purposes.

On Windows you could probably force a specific k/n pair by hacking the registry files (I haven't tried this, but I think it might work), and on Linux I am not sure which files you would need to edit to force a specific k/n pair to be tested.

Rather than worry about cEMs, why not just run a test with a smaller n value and let it complete the test?

Troodon
01-12-2003, 05:50 AM
Originally posted by jjjjL
All archs are optimized. AMDs run faster than P3s because their FSB is higher and their memory arch is better. P4s smoke even Athlons because their bus is faster yet and there are SSE2 optimizations. There are optimizations for all instructions though (3dnow, mmx, sse, sse2). If your proc supports it... it uses it.


Does it use prefetch instructions?
I would like to be able to see somewhere in the client something like "Detected XXX processor" "Using YYY, ZZZ optimizations".

shifted
01-18-2003, 02:26 PM
Originally posted by Firebirth
For Linux - what flags was the compiler given when compiling the client; and in what compiler + version was it done?

This doesn't matter as the program spends almost all the execution time in the assembler core.

jjjjL
01-18-2003, 04:20 PM
yes prefetch instructions are used.

-Louie

RangerX
01-19-2003, 01:22 PM
I posted about this in the OTHER SB forum (don't know how I ended up there instead of this one), but I'm copying a bit of it here...

On a side note, I seem to have lost the Windows vs. Linux thread, but I have some evidence to contribute:
http://chaosworks.topcities.com/personal/pics/user24.jpg

As you can see, while running Linux (L) I was consistently below the Windows (W) production levels. I have no idea why, but there's the proof. This is one 1 GHz Athlon Thunderbird with 512 RAM (if RAM matters). On the good side, the user statistics are telling me I have a 1.66-2 GHz computer :D :thumbs: Unfortunately I don't have a massive computer farm to throw into the project or else I would.

EDIT: Of course these were with different values (see my post on the other board for info), but I think a difference of that much HAS to be based on more than just increasing complexity.