Yes and no. The answer comes down to chip-specific optimization; the only thing you're missing is optimization for the AMD, which is not available.
The AMD is a very good rig and does have SSE2 support, which makes it better than a regular Athlon XP. However, the clock speed of the Athlon is low compared to the Celerons etc.
In other words, your (roughly 1800 MHz?) Athlon64 3000+ is doing very, very well compared to an 1800 MHz P4, but poorly compared to a P4-3000.
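If you want to see that comparison in numbers, here is a rough Python sketch. The rates are made-up placeholders, not measurements from your machines, and `per_mhz` is just a helper name I picked; plug in whatever figures your client reports.

```python
def per_mhz(rate, clock_mhz):
    """Work done per MHz of clock; rate is in whatever unit the client reports."""
    return rate / clock_mhz

# Hypothetical numbers for illustration only, NOT real benchmark results.
a64_rate, a64_clock = 100.0, 1800.0   # Athlon64 3000+ at roughly 1800 MHz
p4_rate, p4_clock = 150.0, 3000.0     # P4 at 3000 MHz

a64_eff = per_mhz(a64_rate, a64_clock)
p4_eff = per_mhz(p4_rate, p4_clock)

print(f"A64: {a64_eff:.4f} per MHz, P4: {p4_eff:.4f} per MHz")
print("Per clock:", "A64 ahead" if a64_eff > p4_eff else "P4 ahead")
print("Overall:  ", "A64 ahead" if a64_rate > p4_rate else "P4 ahead")
```

With placeholder numbers like these, a chip can be ahead per clock and still lose overall to a chip clocked much higher.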
Also, on your comment about the Celerys compared to the P4: the clock speeds are the same but the cache is different. This goes to show that the cache doesn't help much here, whereas in day-to-day applications the Athlon's huge cache crushes that little Celery, right!! This is the main reason for the PR rating scheme from AMD and Intel (532, etc.).
Your best bet for the Athlon is sieving. I suggest that you check out this section of the forum, download the proth client, and reserve a range for that A64. I prefer sieving to PRP testing, which is what 2.3.0 does.
Also, those P4s have HT, correct? Are you running two instances of the program with the service install? This can boost those P4s' output by as much as 50%.
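To check what HT actually buys you, compare one instance running alone against two running side by side. A rough Python sketch, where `ht_gain` is a helper name I made up and the rates are placeholders rather than measurements:

```python
def ht_gain(single_rate, per_instance_rate, instances=2):
    """Fractional gain in total output from running several instances at once.

    single_rate: rate of one instance running alone
    per_instance_rate: rate of each instance when they run side by side
    """
    combined = per_instance_rate * instances
    return combined / single_rate - 1.0

# Hypothetical numbers for illustration only: each of two instances
# running at 75% of the speed a lone instance would reach.
gain = ht_gain(single_rate=1.0, per_instance_rate=0.75)
print(f"Total output change: {gain:+.0%}")
```

Each instance slows down when sharing the core, so the gain only shows up in the combined total, not in either instance on its own.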
Let me know if you need a hand or have more questions.