[sldev] getting serious about software.

Callum Lerwick seg at haxxed.com
Fri Jun 22 20:32:24 PDT 2007


Don't underestimate the compiler's ability to optimize. I did a little
writeup on my LJ:

http://ninjaseg.livejournal.com/49842.html

Lesson: Code what you mean, "tmp / 2", and the compiler can optimize it
in the best way for the platform you tell it to target.
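
As a rough sketch of the idea (the writeup's code isn't reproduced
here, so halve_samples and its arguments are made-up names for
illustration), a loop that just says "divide by two" might look like
this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: halve every sample in
 * place. Names are assumptions, not taken from the original code. */
void halve_samples(int *samples, size_t count)
{
    assert(samples != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        int tmp = samples[i];

        /* Say what you mean: a division. For signed int, C99 division
         * truncates toward zero, so -3 / 2 is -1. A hand-written
         * "tmp >> 1" is not equivalent for negative values, and the
         * compiler is free to pick the cheapest correct sequence. */
        samples[i] = tmp / 2;
    }
}
```

With optimization on, gcc will typically turn the signed divide into a
shift plus a small sign fixup by itself, so writing the shift by hand
mostly risks changing the result for negative inputs.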

Though notably, gcc 4.1 is too dumb to vectorize an integer divide on
its own. On the other hand, vectorization hasn't proven to be a benefit
in this case. MMX/SSE is of little benefit in a loop with a single
divide; in fact it seems to slow it down a little. Moving things in and
out of the registers is the bottleneck. Vectorization is only a benefit
if you can keep everything in the registers and do a large number of
operations in a row.

The cleanup on the way to vectorization has resulted in a measurable
speedup though.

But the biggest speedup by far has been from reducing cache pollution
and memory overhead.

> --- Write custom code for your math ---
> All modern CPU's support SIMD instructions aimed at 3D. Compilers
> don't convert your vector math for you, you need to do it yourself.

Compilers are actually starting to be pretty good at autovectorization.
You still need to design the code with vectorization in mind. gcc in
particular is very picky about aliasing: it basically requires C99
"restrict" on all pointers, and structure members copied into temporary
variables...
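
A minimal sketch of that style, assuming a simple gain loop (struct
gain_stage and apply_gain are hypothetical names, not from any real
code base). Whether gcc actually vectorizes it depends on the version
and flags, for example -O3 or -ftree-vectorize:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct gain_stage {
    float  gain;
    size_t count;
};

/* "restrict" promises the compiler that in and out never overlap. */
void apply_gain(const struct gain_stage *stage,
                const float *restrict in,
                float *restrict out)
{
    assert(stage != NULL);

    /* Copy structure members into locals first, so the compiler does
     * not have to assume a store through "out" could modify them. */
    const float  gain  = stage->gain;
    const size_t count = stage->count;

    for (size_t i = 0; i < count; i++)
        out[i] = in[i] * gain;
}
```

The caller has to keep the promise: passing overlapping buffers to a
function with restrict-qualified pointers is undefined behaviour.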

I've experimented with Intel's compiler, but I can't actually get the
thing to link right on Fedora 6/7, so I've only been able to look at its
assembler output.