Page 2 of 2
Results 41 to 47 of 47

Thread: Apple Mac PPC G5 proth_sieve update

  1. #41
    Moderator Joe O's Avatar
    Join Date
    Jul 2002
    Location
    West Milford, NJ
    Posts
    643
    Quote Originally Posted by Matt
    I thought some people were working on improving the x86 siever... no?
    Improving, Yes.
    It goes to 2^52, page faults less, has yet to go into a loop or leak memory.
    Joe O

  2. #42
    Quote Originally Posted by Joe O
    Improving, Yes.
    It goes to 2^52, page faults less, has yet to go into a loop or leak memory.
    And I heard there were significant speed improvements, is this true? Any idea when it will be out?



  3. #43
    Quote Originally Posted by rogue
    Clearly there are some caching issues. I have had no time to work them out. Alex has been doing the vast majority of coding and testing. He's been using my ASM routines and I've been pointing out some optimizations along the way. If I have time this weekend, I intend to dig more into the memory issues and will try to find a happy medium. BTW, I hadn't thought about prefetching. I will have to think about where it would make the most sense to do it.
    I figured out how to resolve the thrashing issue, but I don't understand what causes it. When I pull some of the functions out and put them into a separate file, I can compile with -O3 and get a 5% gain over -O2. Is gcc inlining one of those functions? If so, why is it killing performance so badly? Maybe one of you guys knows gcc much better than I do and can answer the question.
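
One way to test the inlining theory without moving code into another file is GCC's `noinline` function attribute, which keeps a function as a real call even at -O3. The following is only a minimal sketch: `mulmod_step` and `sum_steps` are hypothetical names for illustration and are not taken from the siever source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hot helper; name and body are illustrative only.
 * noinline asks gcc to keep this as an out-of-line function at any -O level. */
__attribute__((noinline))
uint64_t mulmod_step(uint64_t a, uint64_t b, uint64_t p)
{
    assert(p != 0);
    /* Assumes a and b are below 2^32 so the product fits in 64 bits. */
    return (a * b) % p;
}

/* Caller: because of the attribute above, each iteration makes a real call. */
uint64_t sum_steps(uint64_t n, uint64_t p)
{
    uint64_t acc = 0;
    for (uint64_t i = 1; i <= n; i++)
        acc = (acc + mulmod_step(i, i, p)) % p;
    return acc;
}
```

If the -O3 build with the attribute performs like the split-file build, inlining is the likely culprit. Building the whole file with `gcc -O3 -fno-inline-functions` is a coarser way to run the same experiment, and `gcc -S` output shows whether the call survived.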

  4. #44
    Senior Member Chuck's Avatar
    Join Date
    Aug 2005
    Posts
    406
    Blog Entries
    2
    Mark,
    Check the symbol table addresses of the functions in the external file for starters and see if you are getting page alignments. Conversely, pulling out the functions will allow the functions 'above' and 'below' the removed one(s) to potentially load and be put on the same page and/or fit in cache at the same time. Locality of reference with respect to cache is key.
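
The address check described above can also be done from inside the program. This is a rough sketch under stated assumptions: `hot_routine` is a hypothetical stand-in for one of the extracted functions, and the 64-byte line and 4096-byte page sizes are assumed values that differ between CPUs and operating systems.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ASSUMED_CACHE_LINE 64u   /* assumption; check the target CPU */
#define ASSUMED_PAGE_SIZE  4096u /* assumption; check the target OS */

/* Hypothetical hot routine. GCC's aligned attribute requests that the
 * function start on a 64-byte boundary; noinline keeps it out of line. */
__attribute__((aligned(64), noinline))
uint32_t hot_routine(uint32_t x)
{
    return x * 2654435761u;
}

/* Offset of an address within its (assumed) page. */
unsigned page_offset(uintptr_t addr)
{
    assert(addr != 0);
    return (unsigned)(addr % ASSUMED_PAGE_SIZE);
}

/* Offset of an address within its (assumed) cache line. */
unsigned line_offset(uintptr_t addr)
{
    return (unsigned)(addr % ASSUMED_CACHE_LINE);
}

/* Print where the routine landed. Converting a function pointer to an
 * integer is implementation-defined, but gcc supports it on common targets. */
void report_placement(void)
{
    uintptr_t a = (uintptr_t)&hot_routine;
    printf("hot_routine: page offset %u, line offset %u\n",
           page_offset(a), line_offset(a));
}
```

Running `nm -n` on the binary gives the same information for every symbol at once, sorted by address, which makes it easier to see which functions ended up next to each other.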

    You have control of inlining with GCC, as you know, and it shouldn't inline functions unless your default CFLAGS are set to do so or you specify it in code or on the command line. Check which flags '-O3' sets for your processor. It might do an 'auto inline', as you suspect. You can of course override it on the command line.

    The x86 version behaves differently, but is still largely 'cache centric'. Cache row size vs. memory bus interface is the other factor. The number of cycles required to get a cache row into the CPU makes a big difference. The AMD and Intel versions have major differences right here. I suspect you are seeing something similar.
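
Prefetching, mentioned earlier in the thread, is one way to hide the cycles spent pulling a cache row into the CPU. Here is a minimal sketch using GCC's `__builtin_prefetch`; the function name `sum_with_prefetch` and the look-ahead distance are assumptions for illustration, not anything from the siever.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_AHEAD 16 /* assumed distance in elements; tune per CPU */

/* Sum an array while hinting that data a few iterations ahead will be read.
 * Arguments to __builtin_prefetch: address, 0 = read access, 1 = low
 * temporal locality. The hint never changes the computed result. */
uint64_t sum_with_prefetch(const uint64_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 1);
        total += data[i];
    }
    return total;
}
```

A plain linear scan like this is usually handled well by hardware prefetchers already, so an explicit hint is more likely to pay off on irregular access patterns such as table lookups. Whether it helps at all has to be measured on each target.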

    Email me if you wish and we can discuss off board.

    C.

  5. #45
    Chuck,

    Since Alex (who is aware of my findings) is leading the effort on the PPC port, I'll leave it to him to find a solution. To me it isn't an issue since this workaround works perfectly well.

  6. #46
    Unholy Undead Death's Avatar
    Join Date
    Sep 2003
    Location
    Kyiv, Ukraine
    Posts
    907
    Blog Entries
    1
    where can I download G5 siever???

    got a few macs around
    wbr, Me. Dead J. Dona \


  7. #47
    I am also interested in a version, but for the Intel MacBook Pros.

    Thank you
